Abstract

We present parametric higher-order abstract syntax (PHOAS), a
new approach to formalizing the syntax of programming languages
in computer proof assistants based on type theory. Like higher-order
abstract syntax (HOAS), PHOAS uses the meta language’s
binding constructs to represent the object language’s binding
constructs. Unlike HOAS, PHOAS types are definable in general-purpose
type theories that support traditional functional programming,
like Coq’s Calculus of Inductive Constructions. We walk
through how Coq can be used to develop certified, executable
program transformations over several statically-typed functional
programming languages formalized with PHOAS; that is, each
transformation has a machine-checked proof of type preservation and
semantic preservation. Our examples include CPS translation and
closure conversion for simply-typed lambda calculus, CPS translation
for System F, and translation from a language with ML-style
pattern matching to a simpler language with no variable-arity
binding constructs. By avoiding the syntactic hassle associated with
first-order representation techniques, we achieve a very high degree
of proof automation.

Categories and Subject Descriptors F.3.1 [Logics and meanings
of programs]: Mechanical verification; D.2.4 [Software Engineering]:
Correctness proofs, formal methods, reliability; D.3.4 [Programming
Languages]: Compilers

General Terms Languages, Verification 

Keywords compiler verification, interactive proof assistants,
dependent types, type-theoretic semantics

1. Introduction 

Compiler verification is one of the classic problems of formal
methods. Nearly all computer scientists understand the importance
of the problem, as compiler bugs can negate the benefits of any
techniques for program correctness assurance, formal or informal.
Unlike most potential subjects for program verification, we have
clear correctness specifications for compilers, based on the formal
semantics of programming languages. Even better, the researchers
interested in program verification tend already to be quite familiar
with compilers.

None of these points in favor of studying the problem are new;
they have all been in force for decades, at least. Pondering the 
future 20 years ago, then, it might have seemed reasonable to 
hope that at least the most widely-used production compilers would 
have machine-checked soundness proofs today. Of course, this is 
far from the case; we are not aware of any certified compilers in 
production use today. 

There have been some notably successful research projects.
Moore (1989) used the Boyer-Moore theorem prover to certify the
correctness of a language implementation stack for the Piton
language. However, the Boyer-Moore prover has an important
disadvantage here: it is implemented as a monolithic set of decision
procedures, all of which we must trust are implemented correctly to
believe the result of the verification. We can be more confident in
results produced with provers that, like Isabelle, follow the LCF
tradition; or that, like Coq, are based on foundational type
theories. Both of these implementation strategies allow the use of small
proof-checking kernels, which are all we need to trust to believe the
final results.

More recently, Leroy (2006) has implemented a certified compiler
for a subset of C in Coq in the CompCert project. The final
proof can be checked with a relatively small proof checker which
amounts to a type-checker for a dependently-typed lambda calculus.
However, the path to this result is a bumpy one. In addition to
the compiler implementation proper, the implementation includes
about 17,000 lines of proof. With this as the state-of-the-art, it does
not seem surprising that most compiler implementers would decline
to verify their compilers, leaving that task to researchers who
specialize in it.

Can we get the best of both worlds? Can we verify compilers using
cooperating decision procedures and produce easily-checkable
proof witnesses in the end? In this paper, we suggest a new
technique that removes one barrier in the way of that goal.

Nearly all compiler verification research being done until
very recently, including the two projects that we have already
mentioned, focuses on first-order, imperative programming
languages. However, HOT (higher-order typed) languages like ML
and Haskell can benefit just as much from certified
implementations, and, of course, they are near and dear to the hearts of
many people working in program verification. Unfortunately, the
pantheon of challenges associated with compiler verification only
grows with the addition of higher-order features and fancy type
systems.

Central among these challenges is the question of how to deal
with nested variable binding constructs in object languages. The
POPLmark Challenge (Aydemir et al. 2005) helped draw attention
to these issues recently, issuing a challenge problem in mechanized
language metatheory that drew many solutions employing
many different binder representation strategies. The solutions were
largely split between implementations in Coq (Bertot and Casteran
2004) and Isabelle (Paulson 1994) using first-order variable
representations, and solutions in Twelf (Pfenning and Schürmann 1999)
using higher-order abstract syntax (Pfenning and Elliot 1988). The
first-order proofs required much more verbosity surrounding
bookkeeping about variables, but the Twelf implementations involved
more tedious proving for the lemmas that would actually appear in
a paper proof, as Twelf lacks any production-quality proof
automation. This failing of the first-order solutions seems like an
unavoidable drawback. The failing of the higher-order solutions seems less
fundamental, though certainly much more is known about proof
assistant support for the logics at the cores of Isabelle and Coq than
about such support for Twelf’s meta-logic.

In past work (Chlipala 2007), we tackled these representation
issues in the context of compiler verification. Language metatheory
problems are popular as benchmarks because they can admit
relatively straightforward pencil-and-paper solutions and because
most programming languages researchers are already familiar with
them. At the same time, hardly anyone working outside of
programming language research, including other computer scientists,
recognizes their importance. It is also true that more denotational
approaches to language semantics remove the need for traditional
syntactic metatheory proofs. If you accept a denotational semantics
mapping terms of an object language into a foundational type
theory as the specification of program meanings, then standard
theorems like type safety and strong normalization can be “borrowed”
from the meta language “for free.”

We presented a certified type-preserving compiler from simply-typed
lambda calculus to an idealized assembly language used with
a garbage-collecting runtime system. We proved type preservation
and semantic preservation for each phase of the compiler,
automating the proofs to a large extent using several decision procedures,
including support from a new Coq plug-in for automatic generation
of lemmas about operations that rearrange variables in object
terms. We formalized language syntax and typing rules using
dependently-typed abstract syntax trees, and we used foundational
type-theoretic semantics to assign dynamic semantics via
computable translations to Coq’s Calculus of Inductive Constructions.
Unfortunately, the hassles of first-order binder representations,
such as the de Bruijn representation we chose, are only
exacerbated by adding dependent types.

It is not obvious how to lessen this burden. Higher-order abstract
syntax, as it is usually implemented, allows for writing
non-terminating programs if standard pattern matching facilities are
also provided. The logic/programming language underlying Coq
is designed to forbid non-termination, which would correspond to
logical unsoundness.

In this paper, we present a retooling of our previous work based
on a new representation technique, parametric higher-order abstract
syntax (PHOAS). PHOAS retains most of the primary benefits
of HOAS and first-order encodings. That is, we use the meta
language to do almost all of our variable book-keeping for us, but
we are still able to write many useful (provably total) recursive,
pattern-matching functions over syntax in a very direct way. These
functions are definable in the Calculus of Inductive Constructions
(CIC), and we can use Coq to prove non-trivial theorems about
them, automating almost all of the work with reusable tactics.

In the next section, we introduce PHOAS and use it to implement
a number of translations between statically-typed lambda calculi
in a way that gives us static proof of type preservation. In
Section 3, we sketch how we used the Coq proof assistant to build
machine-checked proofs of semantic preservation for the translations
from Section 2. Section 4 provides some statistics on the
complexity of our Coq developments, including measurements of how
much work is needed to extend a CPS translation with new types.
We wrap up by surveying related work.

We have added PHOAS support to our Coq language formalization
library called Lambda Tamer. The source distribution, which
includes this paper’s main case studies as examples, can be
obtained from:

http://ltamer.sourceforge.net/ 

One final note is in order before we proceed with the main
presentation. While we start off with a few small examples that are
useful in operational treatments of language semantics, our focus
is on more denotational methods. Indeed, we make no particular
claims about the overall benefits of PHOAS in settings where object
languages are not given meaning by executable translations into a
meta language. Many effectful programming language features are
hard to represent in the traditional setting of denotational semantics
with domain theory, but we can mix type-theoretic and operational
reasoning to encode these features without using non-trivial domain
constructions. The features that we handle type-theoretically
are much easier to reason about, and we can restrict our operational
reasoning to a “universal” core calculus with good support for
effects. All this is to justify that the type-theoretic semantics approach
is worth using as the primary source of examples for PHOAS. In
this paper, our example object languages will be pure with easy
pure type-theoretic treatments, and we leave the orthogonal issue
of encoding impure languages for a future paper.

2. Programming with PHOAS 

We will begin by demonstrating how to write a variety of useful 
programs that manipulate syntax implemented with PHOAS. We 
use a non-ASCII notation that is a combination of Coq, Haskell, 
and ML conventions, designed for clarity for readers familiar with 
statically-typed functional programming. 

2.1 Untyped Lambda Calculus 

We begin with the syntax of untyped lambda calculus by defining a 
recursive type term that follows the usual HOAS convention. 

term : *
App : term → term → term
Abs : (term → term) → term

The fact that term is a type is indicated by assigning it type *,
the type of types. While Coq has an infinite hierarchy of universes
for describing types in a way that disallows certain soundness-breaking
paradoxes, we will collapse that hierarchy in this paper
in the interests of clarity. The dubious reader can consult our
implementation to see the corresponding versions that do not take this
shortcut.

App and Abs are the two constructors of terms, corresponding
to function application and abstraction. The type of Abs does not
mention any specification of which variable is being bound, as we
can use the function space of CIC to encode binding structure. For
instance, the identity function λx. x can be encoded as:

id = Abs (λx. x)

The λ on the righthand side of the equation marks a CIC function
abstraction, not the object language abstraction that we are
trying to formalize. This first example looks almost like cheating,
since it includes the original term in it, but we can encode any
lambda calculus term in this way, including the famous
infinite-looping term (λx. x x) (λx. x x):

diverge = App (Abs (λx. App x x)) (Abs (λx. App x x))




selfApply = λx : term. match x with
                       | App x y ⇒ App x y
                       | Abs f ⇒ f (Abs f)

bad = selfApply (Abs selfApply)

Figure 1. An example divergent term

numVars : term(unit) → ℕ
numVars(Var _) = 1
numVars(App e₁ e₂) = numVars e₁ + numVars e₂
numVars(Abs e') = numVars (e' ())

NumVars : Term → ℕ
NumVars(E) = numVars (E unit)

Coq encodes programs and proofs in the same syntactic class,
following the Curry-Howard Isomorphism. If we allow the existence
of terms in this general class that do not terminate when run
as programs, then we can prove any theorem with an infinite loop.
Splitting programs and proofs into separate classes would be
possible, but it would complicate the metatheory and implementation.
Moreover, programs that must terminate are simply easier to reason
about, and this will be very important when we want to prove
correctness theorems for program transformations.

Unfortunately, if Coq allowed the definitions we have developed
so far, we could write non-terminating programs and compromise
soundness. We do not even need any facility for recursive function
definitions; simple pattern-matching is enough. Consider the term
bad defined in Figure 1, following the same basic trick as the last
example. No matter how many β-reductions and simplifications of
pattern matches we apply, bad resists reducing to a normal form.

The root of the trouble here is that we can write terms that do
not correspond to terms of the lambda calculus. Counterexamples
like bad are called exotic terms. There are a number of tricks for
building HOAS encodings that rule out exotic terms, including
meta language enhancements based on new type systems (Fegaras
and Sheard 1996; Schürmann et al. 2001). The technique that
we will use, PHOAS, does not require such enhancements. It is
essentially a melding of weak HOAS (Despeyroux et al. 1995;
Honsell et al. 2001) and the “boxes go bananas” (BGB) (Washburn
and Weirich 2008) HOAS technique.

We can illustrate the central ideas by modifying our example
type definition for this section so that Coq will accept it as valid.
Coq uses simple syntactic criteria to rule out inductive type
definitions that could lead to non-termination. The particular restriction
that rules out the standard HOAS encoding is the positivity
restriction, which says (roughly) that an inductive type may not be used
in the domain of a nested function type within its own definition.
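
To see the positivity restriction at work, one can ask Coq to check the naive HOAS definition directly. The following is a minimal sketch: the names hterm, HApp, and HAbs are ours, chosen only for illustration, and Fail is the standard vernacular prefix that succeeds exactly when the wrapped command is rejected.

```coq
(* Illustrative names (hterm, HApp, HAbs); they do not appear in the paper. *)
(* Coq rejects this definition: hterm occurs to the left of an arrow
   inside the argument type of HAbs, which violates strict positivity. *)
Fail Inductive hterm : Type :=
| HApp : hterm -> hterm -> hterm
| HAbs : (hterm -> hterm) -> hterm.
```

Because the command is wrapped in Fail, the file still compiles, and Coq reports the positivity error as the expected failure.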

A small variation on the original definition sidesteps this rule. 
Instead of defining a type term, we will define an inductive type 
family term(V). Here we see the source of the name “parametric 
higher-order abstract syntax,” as the family term is parametric in 
an arbitrary type V of variables.

term(V) : *
Var : V → term(V)
App : term(V) → term(V) → term(V)
Abs : (V → term(V)) → term(V)

Var, App, and Abs are the three constructors of terms. The new
representation strategy differs from the old HOAS strategy only in
that binders bind variables instead of terms, and those variables are
“injected” explicitly via the Var constructor. For example, we now
represent the identity function as:

id = Abs (λx. Var x)
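
In concrete Coq syntax, the family term(V) might be written as in the sketch below. The section name vars follows the style of the paper's later listings; the Arguments declarations and the names ident and diverge are our own additions for illustration.

```coq
Section vars.
  (* The type of variables, left abstract inside the section. *)
  Variable V : Type.

  (* Binders take a variable, not a term, so the definition is positive. *)
  Inductive term : Type :=
  | Var : V -> term
  | App : term -> term -> term
  | Abs : (V -> term) -> term.
End vars.

(* Let Coq infer the variable type at each constructor use. *)
Arguments Var {V} _.
Arguments App {V} _ _.
Arguments Abs {V} _.

(* The identity function, at an arbitrary variable type. *)
Definition ident (V : Type) : term V := Abs (fun x => Var x).

(* The looping term (\x. x x) (\x. x x) is still representable as syntax;
   it is data here, so defining it does not make Coq itself loop. *)
Definition diverge (V : Type) : term V :=
  App (Abs (fun x => App (Var x) (Var x)))
      (Abs (fun x => App (Var x) (Var x))).
```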

If we fix a type V at the top level of a logical development and
assert some axioms characterizing the properties of variables, we
can arrive at weak HOAS with the Theory of Contexts (Honsell
et al. 2001). This would be enough to allow us to build relational
versions of all of the syntactic operations we care about, but it does
not support a natural style of functional programming with syntax.
For instance, it is unclear how to write a recursive function that
counts the number of variable occurrences in a term.

Figure 2. A function for counting variable uses in a term

The BGB trick is to take advantage of the meta language’s
parametricity properties to rule out exotic terms in a way that lets
us “stash data inside of variables” when we later decide to analyze
terms in particular ways. In our setting, we accomplish this by
defining the final type of terms like this:

Term = ∀V : *. term(V)

That is, we define a polymorphic type, where the universally
quantified variable V may be instantiated to any type we like.
Parametricity is the “theorems for free” property that lets us draw
conclusions like that the type ∀τ : *. τ → τ is inhabited only
by the polymorphic identity function. For our running example,
we can rely on parametricity to guarantee that Terms only “push
variables around,” never examining or producing them in any other
way. We are not aware of a formal proof of parametricity for CIC,
but we can work around this meta-theoretic gap by strengthening
our theorem statements to require parametricity-like properties to
hold of particular concrete terms; we consider this issue more in
Section 3.

Now we can quite easily define a function that counts the number
of variable occurrences in a term, as in Figure 2. The function
NumVars is passed a term E that can be instantiated to any concrete
choice of a variable type. As the only thing we need to know
about variables is where they are, we can choose to make V the
singleton type unit. Now we can use a straightforward recursive
traversal of the term(unit) that results. The most interesting part of
the definition is where, for the case of numVars(Abs e'), we call
the meta language function e' on (), the value of type unit, to
specify “which variable we are binding” or, alternatively, which data we
want to associate with that variable.
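
A Coq rendering of the variable-counting function could look like the sketch below. The inductive definition is repeated so that the fragment stands alone, and the two Example checks are our own additions.

```coq
Inductive term (V : Type) : Type :=
| Var : V -> term V
| App : term V -> term V -> term V
| Abs : (V -> term V) -> term V.

Arguments Var {V} _.
Arguments App {V} _ _.
Arguments Abs {V} _.

Definition Term := forall V : Type, term V.

(* Instantiate V with unit: variables carry no data, only positions. *)
Fixpoint numVars (e : term unit) : nat :=
  match e with
  | Var _ => 1
  | App e1 e2 => numVars e1 + numVars e2
  | Abs e' => numVars (e' tt)   (* tt is the sole value of type unit *)
  end.

Definition NumVars (E : Term) : nat := numVars (E unit).

(* \x. x mentions one variable; \x. x x mentions two. *)
Example numVars_id : NumVars (fun _ => Abs (fun x => Var x)) = 1.
Proof. reflexivity. Qed.

Example numVars_selfapp :
  NumVars (fun _ => Abs (fun x => App (Var x) (Var x))) = 2.
Proof. reflexivity. Qed.
```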

Coq imposes strict syntactic termination conditions on recursive
function definitions, but the definition here of numVars satisfies
them, because every recursive call is on a direct syntactic subterm
of the original function argument. The notion of syntactic subterms
includes arbitrary calls to functions that are arguments of
constructors.

For an example of a non-trivial choice of V, consider the function
in Figure 3, which checks if its term argument is a candidate
for a top-level η-reduction. To perform this check, we instantiate
V as bool and pattern match on the resulting specialized term. We
can return false immediately if the term is not an abstraction.
Otherwise, we apply the body to false, effectively substituting false for
the abstraction’s argument variable. The term we get by applying
to false had better be an application of some unknown term e₁ to a
variable that has been tagged false (that is, the argument variable).
If so, we traverse e₁, substituting true for each new bound variable
we encounter and checking that every free variable is tagged
true. If the traversal’s test passes, then we know that the original
variable never appears in e₁, so the original term is a candidate for
η-reduction.

canEta' : term(bool) → bool
canEta'(Var b) = b
canEta'(App e₁ e₂) = canEta'(e₁) && canEta'(e₂)
canEta'(Abs e') = canEta'(e' true)

canEta : term(bool) → bool
canEta(Abs e') = match e' false with
                 | App e₁ (Var false) ⇒ canEta'(e₁)
                 | _ ⇒ false
canEta(_) = false

CanEta : Term → bool
CanEta(E) = canEta(E bool)

Figure 3. Testing for η-reducibility
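
The η-reducibility test also transcribes into Coq almost verbatim. In this sketch the inductive definition is repeated for self-containment, andb is spelled out instead of the && notation, and the two Example checks are ours.

```coq
Inductive term (V : Type) : Type :=
| Var : V -> term V
| App : term V -> term V -> term V
| Abs : (V -> term V) -> term V.

Arguments Var {V} _.
Arguments App {V} _ _.
Arguments Abs {V} _.

Definition Term := forall V : Type, term V.

(* Every variable tagged false is an occurrence of the candidate argument. *)
Fixpoint canEta' (e : term bool) : bool :=
  match e with
  | Var b => b
  | App e1 e2 => andb (canEta' e1) (canEta' e2)
  | Abs e' => canEta' (e' true)
  end.

Definition canEta (e : term bool) : bool :=
  match e with
  | Abs e' =>
      match e' false with
      | App e1 (Var false) => canEta' e1
      | _ => false
      end
  | _ => false
  end.

Definition CanEta (E : Term) : bool := canEta (E bool).

(* \x. (\y. y) x is an eta-redex; \x. x x is not. *)
Example eta_yes :
  CanEta (fun _ => Abs (fun x => App (Abs (fun y => Var y)) (Var x))) = true.
Proof. reflexivity. Qed.

Example eta_no :
  CanEta (fun _ => Abs (fun x => App (Var x) (Var x))) = false.
Proof. reflexivity. Qed.
```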

We can also write recursive functions that construct terms. We
consider capture-avoiding substitution as an example. Since we are
using a higher-order binding encoding, we define the type of terms
with one free variable like this:

Term1 = ∀V : *. V → term(V)

Now we can implement substitution in a way where, when
we are trying to build a term for a particular V type, we do our
intermediate work with the variable type term(V). That is, we tag
each variable with the term we want to substitute for it.

subst : ∀V : *. term(term(V)) → term(V)
subst(Var e) = e
subst(App e₁ e₂) = App (subst(e₁)) (subst(e₂))
subst(Abs e') = Abs (λx. subst(e' (Var x)))

Subst : Term1 → Term → Term
Subst E₁ E₂ = λV : *. subst(E₁ (term(V)) (E₂ V))

The subst(Abs e') case is trickiest from a termination checking 
perspective, but the same syntactic subterm rule applies. Any call 
to a function that was an argument to the constructor we are pattern 
matching on is allowed, even if the call is inside a meta language 
binder and uses the bound variable. 
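
As a sketch of how this substitution function might look in Coq, consider the fragment below. The inductive definition is repeated so the fragment is self-contained, the variable type is passed explicitly to subst for simplicity, and the closing Example is our own sanity check.

```coq
Inductive term (V : Type) : Type :=
| Var : V -> term V
| App : term V -> term V -> term V
| Abs : (V -> term V) -> term V.

Arguments Var {V} _.
Arguments App {V} _ _.
Arguments Abs {V} _.

Definition Term := forall V : Type, term V.
Definition Term1 := forall V : Type, V -> term V.

(* Variables are tagged with the terms that should replace them. *)
Fixpoint subst (V : Type) (e : term (term V)) : term V :=
  match e with
  | Var e' => e'
  | App e1 e2 => App (subst V e1) (subst V e2)
  (* A fresh variable x is tagged with itself, injected via Var. *)
  | Abs e' => Abs (fun x => subst V (e' (Var x)))
  end.

Definition Subst (E1 : Term1) (E2 : Term) : Term :=
  fun V => subst V (E1 (term V) (E2 V)).

(* Substituting \y. y for x in (x x) yields (\y. y) (\y. y). *)
Example subst_selfapp :
  Subst (fun V x => App (Var x) (Var x)) (fun V => Abs (fun y => Var y))
  = (fun V => App (Abs (fun y => Var y)) (Abs (fun y => Var y))).
Proof. reflexivity. Qed.
```

The recursive call in the Abs case is on e' (Var x), which the termination checker accepts for the reason given above: e' is an argument of the constructor being matched.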

We hope that the examples in this section have provided a good
sense for how PHOAS supports relatively direct functional
definitions of syntactic operations. The choice of different variable
types for different functions provokes some cleverness from the
programmer, on the order of the effort needed in selecting helper
functions in traditional functional programming. Nonetheless, the
convenience advantage of PHOAS over first-order techniques
becomes clear when we move to formal proofs about the functions
we define, letting us proceed without proving any auxiliary
lemmas about variable manipulation.

While examples of mechanized language formalization have almost
always been drawn from the array of syntactic metatheory
properties of languages with operational semantics, in the rest of
this paper we are concerned instead with proving that code translations
on languages with type-theoretic semantics preserve program
meaning. In the rest of Section 2, we will show how to define several
translations on typed lambda calculi, using dependent types to
prove type preservation simultaneously with defining the translations
themselves. Section 3 expands on our results to prove semantic
preservation for the same translations.

Types           τ
Variables       x
Terms           e ::= |x| | true | false | e e | λ f
Term functions  f

Figure 4. Syntax for STLC that makes PHOAS explicit

2.2 CPS Translation for Simply-Typed Lambda Calculus 

We will start by writing a translation from direct-style simply-typed 
lambda calculus (STLC) into continuation-passing style. As our 
single base type, we choose bool, so that every type is inhabited 
by multiple distinct values, making our final correctness theorem 
in Section 3 non-trivial. 

Types      τ ::= bool | τ → τ
Variables  x
Terms      e ::= x | true | false | e e | λx. e

We assume the standard typing rules and omit them here, though
they are implied by the CIC representation that we choose for
terms. We have a straightforward algebraic datatype definition of
types:

type : *
Bool : type
Arrow : type → type → type

We represent terms with a type family term(V), as before. The
difference is that now choices of V have type type → * instead of
type *. That is, we have a different type of variables for each object
language type.

term(V) : type → *
Var : ∀τ : type. V(τ) → term(V) τ
Tru : term(V) Bool
Fals : term(V) Bool
App : ∀τ₁, τ₂ : type. term(V) (Arrow τ₁ τ₂)
      → term(V) τ₁ → term(V) τ₂
Abs : ∀τ₁, τ₂ : type. (V(τ₁) → term(V) τ₂)
      → term(V) (Arrow τ₁ τ₂)

Term = λτ : type. ∀V : type → *. term(V) τ

This follows the general idea of abstract syntax tree types
implemented using generalized algebraic datatypes in, for instance, GHC
Haskell. The main difference is that, in Haskell, type indices must
be meta language types, so we might use the Haskell boolean type
in place of Bool and the Haskell function type constructor in place
of Arrow. Thus, to write recursive functions over those indices in
Haskell requires something like type classes with functional
dependencies (or the more experimental type operators), rather than the
direct pattern-matching definitions that are possible with our Coq
encoding.

Defining the syntax of every object language with the same simple,
mostly textual inductive type definition mechanism is convenient
from a foundational perspective, but it is generally clearer to
work mostly with syntactic abbreviations closer to those used in
pencil-and-paper formalisms. Coq even supports the registration of
arbitrary user-specified recursive descent parsing rules, so we work
with the same simplification in our implementation, modulo a
restriction to the ASCII character set. The syntax that we will use for
STLC, shown in Figure 4, is a slight modification that exposes the
relevant parts of variable representation.

Section vars.
  Variable var : type -> Type.

  Inductive term : type -> Type :=
  | EVar : forall t,
    var t
    -> term t
  | ETrue : term TBool
  | EFalse : term TBool
  | EApp : forall t1 t2,
    term (TArrow t1 t2)
    -> term t1
    -> term t2
  | EAbs : forall t1 t2,
    (var t1 -> term t2)
    -> term (TArrow t1 t2).
End vars.

Figure 5. Coq code to define STLC syntax

We explicitly inject a variable x into the term type as |x|,
and abstractions λ f explicitly involve functions f from variables
to terms. From this point on, we will distinguish between meta
language and object language lambdas by writing the former as λ̂
and the latter as the usual λ. We will write λx. e as shorthand for
λ (λ̂x. e).

Figure 5 shows Coq code for this syntax formalization scheme. 
We use Coq’s “sections” facility to parameterize the term type 
with a type family var without needing to mention var repeatedly 
within the definition. 
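
To show how the sectioned definition is used once the section is closed, here is a small self-contained sketch. It repeats the definition of Figure 5 in compact form, supplies a definition of type with the constructor names TBool and TArrow that Figure 5 mentions, and adds implicit-argument declarations and example terms (idBool, appIdTrue) that are our own assumptions rather than part of the paper's development.

```coq
Inductive type : Type :=
| TBool : type
| TArrow : type -> type -> type.

Section vars.
  Variable var : type -> Type.

  Inductive term : type -> Type :=
  | EVar : forall t, var t -> term t
  | ETrue : term TBool
  | EFalse : term TBool
  | EApp : forall t1 t2, term (TArrow t1 t2) -> term t1 -> term t2
  | EAbs : forall t1 t2, (var t1 -> term t2) -> term (TArrow t1 t2).
End vars.

(* After the section closes, every constructor takes var as an argument;
   we make it (and the object types) implicit. *)
Arguments EVar {var t} _.
Arguments ETrue {var}.
Arguments EFalse {var}.
Arguments EApp {var t1 t2} _ _.
Arguments EAbs {var t1 t2} _.

Definition Term (t : type) := forall var : type -> Type, term var t.

(* \x : bool. x *)
Definition idBool : Term (TArrow TBool TBool) :=
  fun _ => EAbs (fun x => EVar x).

(* (\x : bool. x) true *)
Definition appIdTrue : Term TBool :=
  fun _ => EApp (EAbs (fun x => EVar x)) ETrue.

(* Ill-typed object terms are rejected by Coq's own type checker. *)
Fail Definition bad : Term TBool := fun _ => EApp ETrue ETrue.
```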

The target language for our CPS translation is a linearized form
of the source language, where functions never return, indicated by
continuation types of the form τ → 0; and programs are broken
up into sequences of primitive operations. Since we will give the
language semantics type-theoretically, we do not bother to include
a syntactic class of “values,” and we put constants like true and
false directly in the class of primops.


Types           τ ::= bool | τ → 0 | τ × τ
Variables       x
Terms           e ::= halt(x) | x x | let p in f
Primops         p ::= |x| | true | false | λ f
                    | ⟨x, x⟩ | π₁ x | π₂ x
Term functions  f

We will use let x = e₁ in e₂ as shorthand for let e₁ in λ̂x. e₂.

In the interests of space, we will omit here the Coq definition of
the mutually inductive PHOAS types for terms and primops. The
main complication of the CPS language over the source language is
that terms are represented with a type family cpsTerm(V, τ), with
τ : cpsType. While terms do not return directly, the meta language
type of a term includes the parameter τ to determine which type
of argument “the top-level continuation” is expecting. A primop
whose value is of type τ₂ and whose top-level continuation expects
type τ₁ has type cpsPrimop(V, τ₁) τ₂.

letTerm : ∀V : cpsType → *. ∀τ₁, τ₂ : cpsType.
          cpsTerm(V, τ₁)
          → (V(τ₁) → cpsTerm(V, τ₂))
          → cpsTerm(V, τ₂)
letTerm (halt(x)) e' = e' x
letTerm (x₁ x₂) e' = x₁ x₂
letTerm (let p in e) e' = let x = letPrim p e'
                          in letTerm (e x) e'

letPrim : ∀V : cpsType → *. ∀τ₁, τ₂, τ : cpsType.
          cpsPrimop(V, τ₁) τ
          → (V(τ₁) → cpsTerm(V, τ₂))
          → cpsPrimop(V, τ₂) τ
letPrim (λ f) e' = λx. letTerm (f x) e'
letPrim p e' = p

Figure 6. Term splicing

We can now give a straightforward definition of a CPS translation
that translates almost literally into Coq code. Coq has a notion
of notation scopes to support overloaded parsing rules, and we will
take advantage of similar conventions here to shorten the definition.
For instance, the text bool can mean either the source or CPS
boolean type, depending on context, and we will use the notation
⌊·⌋ to indicate both the translation ⌊τ⌋ of a type and the translation
⌊e⌋ of a term. Our type translation is:

⌊·⌋ : type → cpsType
⌊bool⌋ = bool
⌊τ₁ → τ₂⌋ = (⌊τ₁⌋ × (⌊τ₂⌋ → 0)) → 0

We traverse type structure, changing all function types into 
continuation types where a return continuation has been added as 
an extra argument. 
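
The type translation is an ordinary structural recursion, which can be sketched in Coq as follows. The names ptype and cpsType match those used in the paper's later Coq listing, while the constructor names PBool, PCont, and PProd are assumptions of ours.

```coq
Inductive type : Type :=
| TBool : type
| TArrow : type -> type -> type.

(* CPS types; the constructor names are illustrative. *)
Inductive ptype : Type :=
| PBool : ptype
| PCont : ptype -> ptype             (* t -> 0 *)
| PProd : ptype -> ptype -> ptype.   (* t1 x t2 *)

(* A function becomes a continuation that takes a pair of the
   argument and a return continuation. *)
Fixpoint cpsType (t : type) : ptype :=
  match t with
  | TBool => PBool
  | TArrow t1 t2 => PCont (PProd (cpsType t1) (PCont (cpsType t2)))
  end.

Example cps_bool_arrow :
  cpsType (TArrow TBool TBool) = PCont (PProd PBool (PCont PBool)).
Proof. reflexivity. Qed.
```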

The let term form only allows us to bind a primop in a term. To 
define the main term translation, we want a derived let for binding 
terms in terms, and we can define it as the function letTerm (which 
is defined by mutual recursion with letPrim) in Figure 6. 
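
Since the Coq definition of the CPS syntax is omitted from the text, the following is only one plausible rendering, together with the splicing functions. The names ptype, pterm, PHalt, and PAbs appear in the paper's later Coq listing; every other constructor name, the name pprimop, and the exact argument order are our assumptions. The match expressions use the usual trick of returning a function so that the type of the continuation argument is refined along with the index.

```coq
Inductive ptype : Type :=
| PBool : ptype
| PCont : ptype -> ptype
| PProd : ptype -> ptype -> ptype.

Section vars.
  Variable var : ptype -> Type.

  (* pterm r: a term whose top-level continuation expects type r.
     pprimop r t: a primop of type t occurring inside such a term. *)
  Inductive pterm : ptype -> Type :=
  | PHalt : forall r, var r -> pterm r
  | PApp : forall r t, var (PCont t) -> var t -> pterm r
  | PBind : forall r t, pprimop r t -> (var t -> pterm r) -> pterm r
  with pprimop : ptype -> ptype -> Type :=
  | PVar : forall r t, var t -> pprimop r t
  | PTrue : forall r, pprimop r PBool
  | PFalse : forall r, pprimop r PBool
  | PAbs : forall r t, (var t -> pterm r) -> pprimop r (PCont t)
  | PPair : forall r t1 t2, var t1 -> var t2 -> pprimop r (PProd t1 t2)
  | PFst : forall r t1 t2, var (PProd t1 t2) -> pprimop r t1
  | PSnd : forall r t1 t2, var (PProd t1 t2) -> pprimop r t2.

  (* Replace every halt(x) in e by the term e' x. *)
  Fixpoint letTerm (t1 t2 : ptype) (e : pterm t1)
           (e' : var t1 -> pterm t2) {struct e} : pterm t2 :=
    match e in pterm t return (var t -> pterm t2) -> pterm t2 with
    | PHalt _ x => fun k => k x
    | PApp _ _ f x => fun _ => PApp _ _ f x
    | PBind _ _ p body => fun k =>
        PBind _ _ (letPrim _ _ _ p k) (fun x => letTerm _ _ (body x) k)
    end e'
  with letPrim (t1 t2 t : ptype) (p : pprimop t1 t)
           (e' : var t1 -> pterm t2) {struct p} : pprimop t2 t :=
    match p in pprimop r t0 return (var r -> pterm t2) -> pprimop t2 t0 with
    | PVar _ _ x => fun _ => PVar _ _ x
    | PTrue _ => fun _ => PTrue _
    | PFalse _ => fun _ => PFalse _
    | PAbs _ _ f => fun k => PAbs _ _ (fun x => letTerm _ _ (f x) k)
    | PPair _ _ _ x y => fun _ => PPair _ _ _ x y
    | PFst _ _ _ x => fun _ => PFst _ _ _ x
    | PSnd _ _ _ x => fun _ => PSnd _ _ _ x
    end e'.

  (* Splicing with the trivial continuation leaves a term unchanged. *)
  Example letTerm_halt :
    letTerm PBool PBool
            (PBind _ _ (PTrue _) (fun x => PHalt _ x))
            (fun x => PHalt _ x)
    = PBind _ _ (PTrue _) (fun x => PHalt _ x).
  Proof. reflexivity. Qed.
End vars.
```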

Finally, we give the overall term translation in Figure 7. 

The definition is deceptively easy; Coq accepts it with no further
fuss, which implies that a proof of type preservation for the
translation is implicit in the translation’s definition. The trick to making
this work lies in taking advantage of our freedom to pick smart
instantiations for the variable type family V. In particular, we see an
odd variable type choice in the type of the term translation.

For a given V that we want to use for the resulting CPS term,
we choose to type source variables with the function V ∘ ⌊·⌋,
the composition of V with the type translation function. That is,
while the translation takes source terms as input, we interpret their
variables in a CPS-specific way. The source term is handed to us
in parametric form, so it is no problem to choose its V to be the
correct function. In doing so, we find that each variable in the term
we produce has exactly the right type to use as a CPS variable in
the translation result.

Our translation is a term of CIC, and CIC’s strong normalization
theorem implies that any application of the translation to a concrete,
well-typed source term can be normalized to a concrete, well-typed
CPS term. Coq will perform such normalizations for us
automatically, during proofs and in independent queries. Coq will extract
our translation to executable OCaml or Haskell code automatically.

Figure 8 gives the Coq code for the main translation. We again
make use of sections, where, conceptually, we fix for the whole
section the V choice var that we are compiling into, and, when we
close the section, the function cpsTerm is extended to take var as
an extra argument. We use syntactic sugar for CPS terms that is
defined elsewhere. Sometimes we need to drop down to using the
raw constructors of the CPS language to help type inference. For
instance, the snippet PHalt (var := var) v gives an explicit
value for the implicit parameter var of the halt term constructor.

⌊·⌋ : ∀V : cpsType → *. ∀τ : type.
      term(V ∘ ⌊·⌋) τ → cpsTerm(V, ⌊τ⌋)
⌊|x|⌋ = halt(x)
⌊true⌋ = let x = true in halt(x)
⌊false⌋ = let x = false in halt(x)
⌊e₁ e₂⌋ = letTerm ⌊e₁⌋ (λ̂f.
          letTerm ⌊e₂⌋ (λ̂x.
          let k = λr. halt(r) in
          let p = ⟨x, k⟩ in
          f p))
⌊λ f⌋ = let f' = λp.
          let x = π₁ p in
          let k = π₂ p in
          letTerm ⌊f x⌋ (λ̂r.
          k r)
        in halt(f')

⌊·⌋ : ∀τ : type. Term τ → CpsTerm ⌊τ⌋
⌊E⌋ = λ̂V : cpsType → *. ⌊E (V ∘ ⌊·⌋)⌋

Figure 7. CPS translation for STLC

Section cpsTerm.
  Variable var : ptype -> Type.

  Fixpoint cpsTerm t
    (e : term (fun t => var (cpsType t)) t)
    {struct e} : pterm var (cpsType t) :=
    match e in (term t)
      return (pterm var (cpsType t)) with
    | EVar _ v => PHalt (var := var) v
    | ETrue => x <- Tru; Halt x
    | EFalse => x <- Fals; Halt x
    | EApp _ _ e1 e2 =>
      f <- cpsTerm e1;
      x <- cpsTerm e2;
      k <- \r, PHalt (var := var) r;
      p <- [x, k];
      f @@ p
    | EAbs _ _ e' =>
      f <- PAbs (var := var) (fun p =>
        x <- #1 p;
        k <- #2 p;
        r <- cpsTerm (e' x);
        k @@ r);
      Halt f
    end.
End cpsTerm.

Figure 8. Coq code for the STLC CPS translation

(λ̂_. |α|)[τ] ↦ |α|      (λ̂_. bool)[τ] ↦ bool      (λ̂α. |α|)[τ] ↦ τ

τ₁[τ] ↦ τ₁′      τ₂[τ] ↦ τ₂′
---------------------------------------
(λ̂α. τ₁(α) → τ₂(α))[τ] ↦ τ₁′ → τ₂′

∀α. (λ̂α′. τ₁(α′)(α))[τ] ↦ τ₁′(α)
---------------------------------------
(λ̂α. ∀τ₁(α))[τ] ↦ ∀τ₁′

Figure 9. Relational type variable substitution judgment

2.3 CPS Translation for System F 

We can extend the development from the last subsection to arrive 
at a CPS translation for System F. The first wrinkle is that now 
the definition of the type languages becomes nontrivial, thanks 
to the presence of type variables. Fortunately PHOAS adapts to 
this change quite naturally, and we can produce the following 
revised version of our source type language definition, where the 
parameter $\mathcal{T}$ has type $\star$. From this point on in the paper, we will
avoid textual names for constructors of inductive types, instead 
introducing constructors with their syntactic shorthands. 


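\[
\begin{array}{rcl}
\mathsf{type}(\mathcal{T}) & : & \star \\
|\cdot| & : & \mathcal{T} \to \mathsf{type}(\mathcal{T}) \\
\mathsf{bool} & : & \mathsf{type}(\mathcal{T}) \\
\cdot \to \cdot & : & \mathsf{type}(\mathcal{T}) \to \mathsf{type}(\mathcal{T}) \to \mathsf{type}(\mathcal{T}) \\
\forall \cdot & : & (\mathcal{T} \to \mathsf{type}(\mathcal{T})) \to \mathsf{type}(\mathcal{T})
\end{array}
\]

We have a variable injection form $|\alpha|$ as we had before at the
term level, and we have a universal type constructor $\forall f$. We will
write $\forall\alpha.\ \tau$ as shorthand for $\forall(\lambda\alpha.\ \tau)$.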

Now we would like to define the syntax and typing of terms.
To do this, we need to implement substitution of types for type
variables in types. We cannot use the substitution function strategy
from Section 2. The situation is roughly that, once we fix a
particular variable type, we cannot implement as many syntactic
functions, and we must fix a variable type before we can deconstruct
syntax recursively. We can, however, implement some of these
syntactic operations relationally after fixing a variable type. We will
define substitution relationally with inference rules, following the
approach used by Despeyroux et al. (1995).

In Figure 9, we define a judgment $\tau_1[\tau_2] \mapsto \tau_3$, where $\tau_1$ is a
function from type variables to types, and $\tau_2$ and $\tau_3$ are types. The
meaning is that substituting $\tau_2$ for the type variable that $\tau_1$ abstracts
over leads to $\tau_3$. The functions that we give for $\tau_1$ are interpreted as
functions of the meta language, with the usual variable convention,
so that, e.g., the functions $\lambda\alpha.\ |\alpha|$ and $\lambda\_.\ |\alpha|$ are provably distinct,
as long as our domain of type variables has at least two elements.
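
As a rough illustration of how such a relational definition might look in Coq, the sketch below encodes the five rules of Figure 9 as an inductive predicate. The names (tvar, TVar, TBool, TArrow, TAll, Subst and its constructors) are assumptions made for this example, and the small proof at the end is written as an explicit proof term rather than found by logic programming.

```coq
(* Hypothetical encoding of the Figure 9 judgment; names are illustrative. *)
Section Subst.
  Variable tvar : Type.

  Inductive type : Type :=
  | TVar : tvar -> type
  | TBool : type
  | TArrow : type -> type -> type
  | TAll : (tvar -> type) -> type.

  (* Subst t2 t1 t3 stands for the judgment t1[t2] |-> t3. *)
  Inductive Subst (t2 : type) : (tvar -> type) -> type -> Prop :=
  | SBool : Subst t2 (fun _ => TBool) TBool
  | SVarEq : Subst t2 (fun a => TVar a) t2
  | SVarNe : forall a' : tvar, Subst t2 (fun _ => TVar a') (TVar a')
  | SArrow : forall (t1 : tvar -> type) (t1' : type)
                    (t3 : tvar -> type) (t3' : type),
      Subst t2 t1 t1' -> Subst t2 t3 t3' ->
      Subst t2 (fun a => TArrow (t1 a) (t3 a)) (TArrow t1' t3')
  | SAll : forall (t1 : tvar -> tvar -> type) (t1' : tvar -> type),
      (forall a' : tvar, Subst t2 (fun a => t1 a a') (t1' a')) ->
      Subst t2 (fun a => TAll (t1 a)) (TAll t1').

  (* Substituting t for the variable in (a -> bool) yields (t -> bool). *)
  Example subst_arrow : forall t : type,
    Subst t (fun a => TArrow (TVar a) TBool) (TArrow t TBool).
  Proof.
    intro t.
    exact (SArrow t (fun a => TVar a) t (fun _ => TBool) TBool
                  (SVarEq t) (SBool t)).
  Qed.
End Subst.
```

The two variable rules are distinguished purely by the shape of the meta-level function: SVarEq applies when the function returns its own argument, and SVarNe when it ignores it.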

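Now we can define the syntax of terms, which are parameterized
on two different variable types, in Figure 10. We have type
abstractions $\Lambda f$ for functions $f$ from type variables to terms, and
we have type application $e[\tau]$ for term $e$ with a $\forall$ type. The type
of the type application constructor includes $\tau'$, the type that results
from substituting the type argument in the body of $e$’s $\forall$ type. Not
only that, but it is necessary to pass a first-class substitution proof
to the type application constructor. The type $(\tau_1[\tau_2] \mapsto \tau')$ stands
for proofs of that particular proposition. To support building values
of this type, we can formalize the substitution inference rules quite
directly in Coq with an inductive definition. We can then build
substitution proofs for concrete terms quite easily by interpreting the
judgment definition as a logic program, and we can prove general
lemmas about substitutions where the proofs are fully automated
using logic programming. Using Coq’s extraction mechanism, we
can build an executable version of a translation where proof terms
have been erased in a sound way. Thus, first-class proofs impose no
runtime overhead, in contrast to the situation with, e.g., analogous
implementations using GADTs in GHC Haskell.

\[
\begin{array}{rcl}
\mathsf{term}(\mathcal{T}, V) & : & \mathsf{type}(\mathcal{T}) \to \star \\
|\cdot| & : & \forall \tau : \mathsf{type}(\mathcal{T}).\ V(\tau) \to \mathsf{term}(\mathcal{T}, V)\ \tau \\
\mathsf{true} & : & \mathsf{term}(\mathcal{T}, V)\ \mathsf{bool} \\
\mathsf{false} & : & \mathsf{term}(\mathcal{T}, V)\ \mathsf{bool} \\
\cdot\ \cdot & : & \forall \tau_1, \tau_2 : \mathsf{type}(\mathcal{T}).\ \mathsf{term}(\mathcal{T}, V)\ (\tau_1 \to \tau_2) \\
 & & \to \mathsf{term}(\mathcal{T}, V)\ \tau_1 \to \mathsf{term}(\mathcal{T}, V)\ \tau_2 \\
\lambda \cdot & : & \forall \tau_1, \tau_2 : \mathsf{type}(\mathcal{T}).\ (V(\tau_1) \to \mathsf{term}(\mathcal{T}, V)\ \tau_2) \\
 & & \to \mathsf{term}(\mathcal{T}, V)\ (\tau_1 \to \tau_2) \\
\cdot[\cdot] & : & \forall \tau_1 : \mathcal{T} \to \mathsf{type}(\mathcal{T}).\ \forall \tau_2, \tau' : \mathsf{type}(\mathcal{T}). \\
 & & \mathsf{term}(\mathcal{T}, V)\ (\forall \tau_1) \to (\tau_1[\tau_2] \mapsto \tau') \\
 & & \to \mathsf{term}(\mathcal{T}, V)\ \tau' \\
\Lambda \cdot & : & \forall \tau : \mathcal{T} \to \mathsf{type}(\mathcal{T}). \\
 & & (\forall \alpha : \mathcal{T}.\ \mathsf{term}(\mathcal{T}, V)\ (\tau\ \alpha)) \\
 & & \to \mathsf{term}(\mathcal{T}, V)\ (\forall \tau)
\end{array}
\]

Figure 10. PHOAS syntax definitions for System F


\[
\begin{array}{rcl}
\lfloor \cdot \rfloor & : & \forall \mathcal{T} : \star.\ \forall V : \mathsf{cpsType}(\mathcal{T}) \to \star.\ \forall \tau : \mathsf{type}(\mathcal{T}). \\
 & & \mathsf{term}(\mathcal{T}, V \circ \lfloor \cdot \rfloor)\ \tau \to \mathsf{cpsTerm}(\mathcal{T}, V)\ \lfloor \tau \rfloor \\
\lfloor e[\tau] \rfloor & = & \mathsf{letTerm}\ \lfloor e \rfloor\ (\lambda f. \\
 & & \quad \mathsf{let}\ f' = f[\lfloor \tau \rfloor]\ \mathsf{in} \\
 & & \quad \mathsf{let}\ k = \lambda r.\ \mathsf{halt}(r)\ \mathsf{in} \\
 & & \quad f'\ k) \\
\lfloor \Lambda e \rfloor & = & \mathsf{let}\ f = \Lambda \alpha.\ \lambda k. \\
 & & \quad \mathsf{letTerm}\ \lfloor e\ \alpha \rfloor\ (\lambda v.\ k\ v) \\
 & & \mathsf{in}\ \mathsf{halt}(f) \\[1ex]
\lfloor \cdot \rfloor & : & \forall T : \mathsf{Type}.\ \mathsf{Term}\ T \to \mathsf{CpsTerm}\ \lfloor T \rfloor \\
\lfloor E \rfloor & = & \lambda \mathcal{T} : \star.\ \lambda V : \mathsf{cpsType}(\mathcal{T}) \to \star.\ \lfloor E\ \mathcal{T}\ (V \circ \lfloor \cdot \rfloor) \rfloor
\end{array}
\]

Figure 11. Selected cases of CPS translation for System F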

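Finally, we can define the packaged PHOAS versions of types
and terms:

\[
\begin{array}{rcl}
\mathsf{Type} & : & \star \\
\mathsf{Type} & = & \forall \mathcal{T} : \star.\ \mathsf{type}(\mathcal{T}) \\
\mathsf{Term} & : & \mathsf{Type} \to \star \\
\mathsf{Term} & = & \lambda T : \mathsf{Type}.\ \forall \mathcal{T} : \star.\ \forall V : \mathsf{type}(\mathcal{T}) \to \star.\ \mathsf{term}(\mathcal{T}, V)\ (T\ \mathcal{T})
\end{array}
\]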


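We define a CPS version of System F, following the same
conventions as in the last subsection. Here we will only give the
grammar for this language:

\[
\begin{array}{lrcl}
\text{Types} & \tau & ::= & \alpha \mid \mathsf{bool} \mid \tau \to 0 \mid \tau \times \tau \mid \forall \alpha.\ \tau \\
\text{Terms} & e & ::= & \mathsf{halt}(x) \mid x\ x \mid \mathsf{let}\ p\ \mathsf{in}\ f \\
\text{Primops} & p & ::= & |x| \mid \mathsf{true} \mid \mathsf{false} \mid \lambda f \\
 & & \mid & \langle x, x \rangle \mid \pi_1 x \mid \pi_2 x \mid x[\tau] \mid \Lambda g \\
\text{Term functions} & f & & \\
\text{Primop functions} & g & &
\end{array}
\]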


Now we can adapt the CPS translation from the last subsection 
very naturally. Here is the new type translation: 

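\[
\begin{array}{rcl}
\lfloor \cdot \rfloor & : & \forall \mathcal{T} : \star.\ \mathsf{type}(\mathcal{T}) \to \mathsf{cpsType}(\mathcal{T}) \\
\lfloor \mathsf{bool} \rfloor & = & \mathsf{bool} \\
\lfloor \tau_1 \to \tau_2 \rfloor & = & (\lfloor \tau_1 \rfloor \times (\lfloor \tau_2 \rfloor \to 0)) \to 0 \\
\lfloor \forall \tau \rfloor & = & \forall \alpha.\ (\lfloor \tau(\alpha) \rfloor \to 0) \to 0
\end{array}
\]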


We apply a standard double negation transform (Harper and
Lillibridge 1993) to $\forall$ types, which moves the type’s body into a
position where we can quantify over its free type variable without
running afoul of the functions-never-return property of the CPS
language.
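
The type translation can be sketched as a structurally recursive Coq function over PHOAS types. The constructor names below (TVar, TAll, CVar, CCont, CProd, CAll) and the function name cpsTy are assumptions for this example; in particular, the variable case mapping a type variable to itself is our guess at the natural definition, and CCont t stands for the continuation type $t \to 0$.

```coq
(* Hypothetical sketch of the System F type translation; names are illustrative. *)
Section TypeTranslation.
  Variable tvar : Type.

  Inductive type : Type :=
  | TVar : tvar -> type
  | TBool : type
  | TArrow : type -> type -> type
  | TAll : (tvar -> type) -> type.

  Inductive cpsType : Type :=
  | CVar : tvar -> cpsType
  | CBool : cpsType
  | CCont : cpsType -> cpsType              (* t -> 0 *)
  | CProd : cpsType -> cpsType -> cpsType
  | CAll : (tvar -> cpsType) -> cpsType.

  Fixpoint cpsTy (t : type) : cpsType :=
    match t with
    | TVar a => CVar a                       (* assumed variable case *)
    | TBool => CBool
    | TArrow t1 t2 => CCont (CProd (cpsTy t1) (CCont (cpsTy t2)))
    (* Double negation: forall a. (body -> 0) -> 0 *)
    | TAll f => CAll (fun a => CCont (CCont (cpsTy (f a))))
    end.

  Example cpsTy_all_bool :
    cpsTy (TAll (fun _ => TBool)) = CAll (fun _ => CCont (CCont CBool)).
  Proof. reflexivity. Qed.
End TypeTranslation.
```

Note that the recursive call cpsTy (f a) is accepted by Coq's termination checker because f is an argument of the constructor being matched.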

The letTerm and letPrim functions of the last subsection are 
readily adapted to System F, and we use them in the adapted term 
translation, whose new cases are shown in Figure 11. 

Writing the term translation this way elides one detail. Like 
at the source level, building a CPS type application term requires 
providing a proof that a particular type variable substitution is valid. 
We prove the following theorem, where the substitution notation is 
overloaded for both the source and CPS languages: 

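Theorem 1 (CPS translation of substitution proofs). For all $\tau_1$,
$\tau_2$, and $\tau_3$, if $\tau_1[\tau_2] \mapsto \tau_3$, then $(\lambda\alpha.\ \lfloor \tau_1(\alpha) \rfloor)[\lfloor \tau_2 \rfloor] \mapsto \lfloor \tau_3 \rfloor$.

A straightforward induction on the derivation of the premise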
proves Theorem 1. In Coq, the proof is literally just a statement 
of which induction principle to use, chained onto an invocation of 
a generic simplification tactic from the Lambda Tamer library. The 
real term translation references this theorem explicitly to build a 
CPS substitution proof from a source substitution proof. 

2.4 Pattern Match Compilation 

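To have any hope of handling real programming languages, PHOAS
must be able to cope with constructs that bind multiple variables
at once in complicated ways. ML-style pattern matching provides
a familiar example. In this subsection, we will show how to
implement a compilation from pattern matching to a more primitive type
theory. Our source language’s grammar, in usual informal notation, is:

\[
\begin{array}{lrcl}
\text{Types} & \tau & ::= & \mathsf{unit} \mid \tau \to \tau \mid \tau \times \tau \mid \tau + \tau \\
\text{Patterns} & p & ::= & x \mid \langle p, p \rangle \mid \mathsf{inl}\ p \mid \mathsf{inr}\ p \\
\text{Terms} & e & ::= & x \mid () \mid e\ e \mid \lambda x.\ e \mid \langle e, e \rangle \mid \mathsf{inl}\ e \mid \mathsf{inr}\ e \\
 & & \mid & (\mathsf{case}\ e\ \mathsf{of}\ p \Rightarrow e \mid \_ \Rightarrow e)
\end{array}
\]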

To avoid dealing with inexhaustive match failures, we force case 
expressions to include default cases. Also, while the syntax allows 
a variable to appear multiple times in a pattern, our concrete CIC 
encoding makes pattern variables unique by construction. 

The key issue in formalizing this language is deciding how 
to represent patterns and their uses to capture binding structure 
correctly, without making it too hard to write transformations. We 
start in Figure 12 by defining patterns similarly to terms from 
Section 2.2, but with an extra type index giving the types of the 
variables that a pattern binds. This index has type list type, using 
the Coq list constructor, with “cons” operator :: and concatenation
operator $\oplus$.





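\[
\begin{array}{rcl}
\mathsf{pat}(V) & : & \mathsf{type} \to \mathsf{list}\ \mathsf{type} \to \star \\
|\cdot| & : & \forall \tau : \mathsf{type}.\ \mathsf{pat}(V)\ \tau\ [\tau] \\
\langle \cdot, \cdot \rangle & : & \forall \tau_1, \tau_2 : \mathsf{type}.\ \forall \vec{\tau}_1, \vec{\tau}_2 : \mathsf{list}\ \mathsf{type}. \\
 & & \mathsf{pat}(V)\ \tau_1\ \vec{\tau}_1 \to \mathsf{pat}(V)\ \tau_2\ \vec{\tau}_2 \\
 & & \to \mathsf{pat}(V)\ (\tau_1 \times \tau_2)\ (\vec{\tau}_1 \oplus \vec{\tau}_2) \\
\mathsf{inl}\ \cdot & : & \forall \tau_1, \tau_2 : \mathsf{type}.\ \forall \vec{\tau} : \mathsf{list}\ \mathsf{type}. \\
 & & \mathsf{pat}(V)\ \tau_1\ \vec{\tau} \to \mathsf{pat}(V)\ (\tau_1 + \tau_2)\ \vec{\tau} \\
\mathsf{inr}\ \cdot & : & \forall \tau_1, \tau_2 : \mathsf{type}.\ \forall \vec{\tau} : \mathsf{list}\ \mathsf{type}. \\
 & & \mathsf{pat}(V)\ \tau_2\ \vec{\tau} \to \mathsf{pat}(V)\ (\tau_1 + \tau_2)\ \vec{\tau}
\end{array}
\]

Figure 12. Pattern syntax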


To make use of this binding information in the type we give case 
expressions, we will need an auxiliary type definition. The indexed 
heterogeneous list type family tuple is defined in the Lambda 
Tamer library: 

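\[
\begin{array}{rcl}
\mathsf{tuple} & : & \forall T : \star.\ (T \to \star) \to \mathsf{list}\ T \to \star \\
\mathsf{tuple}\ f\ [] & = & \mathsf{unit} \\
\mathsf{tuple}\ f\ (h :: t) & = & f\ h \times \mathsf{tuple}\ f\ t
\end{array}
\]

We can give the case constructor the following type, using tuple
to represent groups of variables being bound at once:

\[
\begin{array}{rcl}
\mathsf{case} \cdot \mathsf{of} \cdot \Rightarrow \cdot \mid \_ \Rightarrow \cdot & : & \forall \tau_1, \tau_2 : \mathsf{type}.\ \mathsf{term}(V)\ \tau_1 \\
 & & \to \mathsf{list}\ (\Sigma \vec{\tau}.\ \mathsf{pat}(V)\ \tau_1\ \vec{\tau} \\
 & & \qquad \times\ (\mathsf{tuple}\ V\ \vec{\tau} \to \mathsf{term}(V)\ \tau_2)) \\
 & & \to \mathsf{term}(V)\ \tau_2 \to \mathsf{term}(V)\ \tau_2
\end{array}
\]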

The list argument is the interesting one. We represent the pattern 
matching branches as a list of pairs of patterns and expressions. We 
need to use a $\Sigma$ dependent pair type to enforce the relationship
between the types of the variables a pattern binds and the types 
that the corresponding expression expects. The type family tuple V 
translates a list of types into the type of a properly-typed variable 
for each. 
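
A definition in the spirit of tuple can be written as a short recursive Coq function. The version below is a sketch and not the Lambda Tamer library's actual code; the index function pick and the example value are invented purely to show how the type computes.

```coq
(* Hypothetical sketch of an indexed heterogeneous list type. *)
Require Import List.
Import ListNotations.

Section Tuple.
  Variable T : Type.
  Variable f : T -> Type.

  (* One component of type (f h) for each element h of the index list. *)
  Fixpoint tuple (ls : list T) : Type :=
    match ls with
    | [] => unit
    | h :: t => (f h * tuple t)%type
    end.
End Tuple.

(* An invented index function: true selects nat, false selects bool. *)
Definition pick (b : bool) : Type := if b then nat else bool.

(* tuple bool pick [true; false] computes to nat * (bool * unit). *)
Example tuple_example : tuple bool pick [true; false] := (3, (true, tt)).
```

With f instantiated to the variable type family $V$ and the index list taken from a pattern's type, a value of this type supplies exactly one properly-typed variable per variable bound by the pattern.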

The elaborated version of this language is a small variation on 
the source language of Section 2.2. We give only the syntax, in 
standard informal style: 

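\[
\begin{array}{lrcl}
\text{Types} & \tau & ::= & \mathsf{unit} \mid \tau \to \tau \mid \tau \times \tau \mid \tau + \tau \\
\text{Terms} & e & ::= & x \mid () \mid e\ e \mid \lambda x.\ e \\
 & & \mid & \langle e, e \rangle \mid \pi_1 e \mid \pi_2 e \mid \mathsf{inl}\ e \mid \mathsf{inr}\ e \\
 & & \mid & (\mathsf{case}\ e\ \mathsf{of}\ \mathsf{inl}\ x \Rightarrow e \mid \mathsf{inr}\ x \Rightarrow e)
\end{array}
\]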


We want to translate pattern matching in a way that avoids 
decomposing the same term twice in the dynamic execution of 
the translation of the same source case expression. To do this, 
we will use an intermediate representation of patterns that maps 
every possible “shape” of the discriminee to the proper expression 
to evaluate. elabTerm is the type family for terms of the target 
language. 


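\[
\begin{array}{rcl}
\mathsf{ctree}(V) & : & \mathsf{type} \to \star \to \star \\
\mathsf{ctree}(V)\ (\tau_1 \times \tau_2)\ T & = & \mathsf{ctree}(V)\ \tau_1\ (\mathsf{ctree}(V)\ \tau_2\ T) \\
\mathsf{ctree}(V)\ (\tau_1 + \tau_2)\ T & = & \mathsf{ctree}(V)\ \tau_1\ T \times \mathsf{ctree}(V)\ \tau_2\ T \\
\mathsf{ctree}(V)\ \tau\ T & = & \mathsf{elabTerm}(V)\ \tau \to T
\end{array}
\]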


ctree expands a type into a possibly exponential number of
functions, one for each shape of that type. For instance, for any
$T$, $\mathsf{ctree}(V)\ ((\mathsf{unit} + (\mathsf{unit} \to \mathsf{unit})) \times \mathsf{unit})\ T$ reduces to the term
in Figure 13.
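
The reduction claimed here can be checked mechanically. The following sketch defines a ctree-like type function over an assumed inductive type of object-language types (Unit, Arrow, Prod, Sum are illustrative constructor names) and takes the target term family as an abstract parameter elabTerm, since the target language itself is not defined in this sketch.

```coq
(* Hypothetical sketch; elabTerm is left abstract. *)
Inductive type : Type :=
| Unit : type
| Arrow : type -> type -> type
| Prod : type -> type -> type
| Sum : type -> type -> type.

Fixpoint ctree (elabTerm : type -> Type) (t : type) (T : Type)
  {struct t} : Type :=
  match t with
  | Prod t1 t2 => ctree elabTerm t1 (ctree elabTerm t2 T)
  | Sum t1 t2 => (ctree elabTerm t1 T * ctree elabTerm t2 T)%type
  | _ => elabTerm t -> T
  end.

(* The example from the text, which normalizes to the type of Figure 13. *)
Example ctree_example : forall (E : type -> Type) (T : Type),
  ctree E (Prod (Sum Unit (Arrow Unit Unit)) Unit) T
  = ((E Unit -> E Unit -> T) * (E (Arrow Unit Unit) -> E Unit -> T))%type.
Proof. intros; reflexivity. Qed.
```

The product case nests one tree inside another, which is where the potentially exponential growth comes from: each sum in the first component duplicates the tree for the second.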


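\[
\begin{array}{rcl}
\mathsf{xPat}(V) & : & \forall \tau : \mathsf{type}.\ \forall \vec{\tau} : \mathsf{list}\ \mathsf{type}.\ \forall T : \star. \\
 & & \mathsf{pat}(V)\ \tau\ \vec{\tau} \\
 & & \to (\mathsf{tuple}\ (\mathsf{elabTerm}(V))\ \vec{\tau} \to T) \\
 & & \to T \to \mathsf{ctree}(V)\ \tau\ T \\
\mathsf{xPat}(V)\ |x|\ s\ f & = & \mathsf{everywhere}\ (\lambda d.\ s\ (d, ())) \\
\mathsf{xPat}(V)\ \langle p_1, p_2 \rangle\ s\ f & = & \mathsf{xPat}(V)\ p_1 \\
 & & \quad (\lambda \vec{x}_1.\ \mathsf{xPat}(V)\ p_2 \\
 & & \qquad (\lambda \vec{x}_2.\ s\ (\vec{x}_1 \oplus \vec{x}_2))\ f) \\
 & & \quad (\mathsf{everywhere}\ (\lambda\_.\ f)) \\
\mathsf{xPat}(V)\ (\mathsf{inl}\ p)\ s\ f & = & (\mathsf{xPat}(V)\ p\ s\ f,\ \mathsf{everywhere}\ (\lambda\_.\ f)) \\
\mathsf{xPat}(V)\ (\mathsf{inr}\ p)\ s\ f & = & (\mathsf{everywhere}\ (\lambda\_.\ f),\ \mathsf{xPat}(V)\ p\ s\ f)
\end{array}
\]

Figure 14. Pattern compilation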


We can define a translation of pats into ctrees, as in Figure 
14. The last two arguments of the function xPat are success and 
failure continuations. We overload $\oplus$ to denote concatenation of
tuples. We use an auxiliary function everywhere, which takes an 
expression that needs no more free variables provided and builds a 
ctree that maps every shape to that expression. 

The xPat function is the essence of the translation. The remaining
pieces are a way of merging ctrees, a way of expanding them
into expressions of the target language, and the translations for the 
kinds of expressions beyond case, which are very straightforward. 
As usual, all of these pieces have static types that guarantee that 
they map well-typed syntax to well-typed syntax. 

2.5 Closure Conversion for Simply-Typed Lambda Calculus 

The last three subsections have demonstrated how PHOAS supports
convenient programming with a variety of different kinds of
variable binding. In every one of these examples, details of variable
identity have been unimportant. However, there are important
classes of language formalization problems where variable identity
is central. The example that we will use in this subsection is
closure conversion, which has long been regarded as a tricky
challenge problem for HOAS. In Twelf, closure conversion might be
formalized using syntactic marker predicates over variables or another
approach that leaves binding completely higher-order. In contrast,
here we will use the opposite approach. To write particular
functions, we can choose the variable type V to follow any of the
standard first-order representation techniques. That is, PHOAS can
function as something of a chameleon, passing back and forth
between first-order and higher-order representations as is convenient.

To implement closure conversion for STLC, we chose to work 
with V as nat, the type of natural numbers, using a particular 
convention for choosing the number to assign to a variable each 
time we “go inside a binder.” Variable numbers will be interpreted 
as de Bruijn levels, where a variable’s number is calculated by 
finding its binder and counting how many other binders enclose 
the original binder. This gives us the convenient property that a 
variable’s number is the same throughout the variable’s scope. The 
encoding enjoys similar properties to de Bruijn indices (de Bruijn 
1972), which count binders starting from a variable use and moving 
towards the AST root rather than in the opposite direction, but de 
Bruijn levels are more convenient in this case. 
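
The numbering convention can be made concrete with a small sketch for untyped terms. Instantiating the variable type with nat and passing the current binder depth at each abstraction yields a first-order term in de Bruijn level form. The types and names here (term, fo, toLevels, K) are invented for illustration and are much simpler than the typed development described in this subsection.

```coq
(* Hypothetical sketch: PHOAS terms to first-order de Bruijn level terms. *)
Section Term.
  Variable var : Type.

  Inductive term : Type :=
  | Var : var -> term
  | App : term -> term -> term
  | Abs : (var -> term) -> term.
End Term.

Arguments Var {var} _.
Arguments App {var} _ _.
Arguments Abs {var} _.

(* First-order syntax; FAbs binds the variable numbered by its depth. *)
Inductive fo : Type :=
| FVar : nat -> fo
| FApp : fo -> fo -> fo
| FAbs : fo -> fo.

(* depth counts the binders enclosing the current position. *)
Fixpoint toLevels (e : term nat) (depth : nat) : fo :=
  match e with
  | Var n => FVar n
  | App e1 e2 => FApp (toLevels e1 depth) (toLevels e2 depth)
  | Abs e' => FAbs (toLevels (e' depth) (S depth))
  end.

Definition K : forall var : Type, term var :=
  fun var => Abs (fun x => Abs (fun _ => Var x)).

(* The outer variable keeps number 0 throughout its scope.
   With de Bruijn indices the same occurrence would be numbered 1. *)
Example K_levels : toLevels (K nat) 0 = FAbs (FAbs (FVar 0)).
Proof. reflexivity. Qed.
```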

Since we are being explicit about variable identity, we can no
longer get away with relying only on CIC’s parametricity in
defining the translation. We need to define a notion of well-formedness
of de Bruijn level terms. Later we will prove that every parametric
term is well-formed when instantiated with V as nat.

\[
(\mathsf{elabTerm}(V)\ \mathsf{unit} \to \mathsf{elabTerm}(V)\ \mathsf{unit} \to T) \times (\mathsf{elabTerm}(V)\ (\mathsf{unit} \to \mathsf{unit}) \to \mathsf{elabTerm}(V)\ \mathsf{unit} \to T)
\]

Figure 13. Example normalized ctree type

Our actual implementation of closure conversion is meant to
come after CPS conversion in a compilation pipeline, but here we
will treat the source language of Section 2.2 instead for simplicity.
We define well-formedness with a recursive function over terms,
parameterized on an explicit type environment. The judgment is
also parameterized on a subset of the in-scope variables. The
intended meaning is that only free variables included in this subset
may be used. It is important that we are able to reason about
well-formedness of terms in this way, because we will be choosing
subsets of the variables in a term’s environment when we pack those
terms into closures.

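\[
\begin{array}{rcl}
\mathsf{isfree} & = & \mathsf{tuple}\ (\lambda\_ : \mathsf{type}.\ \mathsf{bool}) \\[1ex]
\mathsf{wf} & : & \forall \Gamma : \mathsf{list}\ \mathsf{type}.\ \forall \gamma : \mathsf{isfree}\ \Gamma.\ \forall \tau : \mathsf{type}.\ \mathsf{term}(\mathsf{nat})\ \tau \to \star \\
\mathsf{wf}\ \gamma\ |n| & = & \gamma.n = \tau \quad \text{(where } \tau \text{ is the type passed in for } |n| \text{)} \\
\mathsf{wf}\ \gamma\ \mathsf{true} & = & \mathsf{unit} \\
\mathsf{wf}\ \gamma\ \mathsf{false} & = & \mathsf{unit} \\
\mathsf{wf}\ \gamma\ (e_1\ e_2) & = & \mathsf{wf}\ \gamma\ e_1 \times \mathsf{wf}\ \gamma\ e_2 \\
\mathsf{wf}\ \gamma\ (\lambda e) & = & \mathsf{wf}\ (\mathsf{true}, \gamma)\ (e(\mathsf{length}\ \gamma))
\end{array}
\]

The notation $\gamma.n = \tau$ stands for the type of first-class proofs that
the $n$th variable from the end of the associated $\Gamma$ is assigned type
$\tau$ and marked as present in $\gamma$. Since we number variables from the
end of $\Gamma$, the $\lambda e$ case passes the term function $e$ the length of $\gamma$ as
the value of its new free variable, while we extend $\gamma$ with true to
indicate that the new variable may be referenced.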

Another central definition is that of the function for calculating
the set of variables that occur free in a term. We use this function
in the translation of every function abstraction, populating the
closure we create with only the free variables. We use auxiliary
functions isfree_none, which builds an isfree tuple of all false values;
isfree_one, which builds an isfree tuple where only one position
is marked true, based on the natural number argument passed in;
and isfree_merge, which walks two isfrees for the same $\Gamma$, applying
boolean “or” to the values in each position.

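\[
\begin{array}{rcl}
\mathsf{fvs} & : & \forall \tau : \mathsf{type}.\ \mathsf{term}(\mathsf{nat})\ \tau \to \forall \Gamma : \mathsf{list}\ \mathsf{type}.\ \mathsf{isfree}\ \Gamma \\
\mathsf{fvs}\ |n|\ \Gamma & = & \mathsf{isfree\_one}\ n \\
\mathsf{fvs}\ \mathsf{true}\ \Gamma & = & \mathsf{isfree\_none} \\
\mathsf{fvs}\ \mathsf{false}\ \Gamma & = & \mathsf{isfree\_none} \\
\mathsf{fvs}\ (e_1\ e_2)\ \Gamma & = & \mathsf{isfree\_merge}\ (\mathsf{fvs}\ e_1\ \Gamma)\ (\mathsf{fvs}\ e_2\ \Gamma) \\
\mathsf{fvs}\ (\lambda e)\ \Gamma & = & \pi_2\ (\mathsf{fvs}\ (e\ (\mathsf{length}\ \Gamma))\ (\tau :: \Gamma)) \\
 & & \text{(where } \tau \text{ is the abstraction domain type)}
\end{array}
\]

We need to prove a critical theorem about the relationship
between wf and fvs, even if we just want to get our closure conversion
function to type-check:

Theorem 2 (Minimality of fvs). For any $\Gamma$ and $\tau$, any $\gamma$ for $\Gamma$,
and any $e$ of type $\tau$, if $\mathsf{wf}\ \gamma\ e$, then $\mathsf{wf}\ (\mathsf{fvs}\ e\ \Gamma)\ e$.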

The proof is based on two main lemmas. The first of them
asserts that, when a term $e$ is well-formed for any set of free
variables for $\Gamma$, that term is also well-formed for every set of free
variables containing $\mathsf{fvs}\ e\ \Gamma$. The second lemma asserts that, when
a term is well-formed for any set of free variables $\gamma$, every variable
included in $\mathsf{fvs}\ e\ \Gamma$ is also included in $\gamma$. Both of these lemmas
are proved by induction on the structure of the term, appealing to a
few smaller lemmas characterizing the interactions of the isfree_*
functions with variable lookup.
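
The shape of the free-variable computation can be sketched for untyped terms. To keep the example short, it departs from the text in one deliberate way: sets of variables are represented as boolean predicates on level numbers rather than as isfree tuples indexed by a type environment, and the names vset, vone, vmerge, and vdrop are invented stand-ins for isfree_one, isfree_merge, and the projection that discards the bound variable.

```coq
(* Hypothetical, simplified sketch: variable sets as predicates on levels. *)
Section Term.
  Variable var : Type.

  Inductive term : Type :=
  | Var : var -> term
  | App : term -> term -> term
  | Abs : (var -> term) -> term.
End Term.

Arguments Var {var} _.
Arguments App {var} _ _.
Arguments Abs {var} _.

Definition vset : Type := nat -> bool.

Definition vone (n : nat) : vset := fun m => Nat.eqb n m.
Definition vmerge (s1 s2 : vset) : vset := fun m => orb (s1 m) (s2 m).
(* Remove the variable bound at level n from the set. *)
Definition vdrop (n : nat) (s : vset) : vset :=
  fun m => if Nat.eqb n m then false else s m.

(* depth is the number of variables already in scope. *)
Fixpoint fvs (e : term nat) (depth : nat) : vset :=
  match e with
  | Var n => vone n
  | App e1 e2 => vmerge (fvs e1 depth) (fvs e2 depth)
  | Abs e' => vdrop depth (fvs (e' depth) (S depth))
  end.

(* An abstraction, sitting at depth 1, that mentions outer variable 0. *)
Definition open_term : term nat := Abs (fun x => App (Var x) (Var 0)).

Example free_0 : fvs open_term 1 0 = true.
Proof. reflexivity. Qed.

Example bound_1 : fvs open_term 1 1 = false.
Proof. reflexivity. Qed.
```

The tuple-based representation in the text carries more static information, since its length is tied to $\Gamma$, which is what makes the statement of Theorem 2 type-check.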

The last element we need to present the closure conversion is a 
type of explicit environments. We implement a type family envOf 
by recursion over an isfree value. The resulting environment type 
contains variables for exactly those positions marked with true. 

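\[
\begin{array}{rcl}
\mathsf{envOf}(V) & : & \forall \Gamma : \mathsf{list}\ \mathsf{type}.\ \mathsf{isfree}\ \Gamma \to \star \\
\mathsf{envOf}(V)\ []\ () & = & \mathsf{unit} \\
\mathsf{envOf}(V)\ (\tau :: \Gamma)\ (\mathsf{true}, \gamma) & = & V(\tau) \times \mathsf{envOf}\ \Gamma\ \gamma \\
\mathsf{envOf}(V)\ (\tau :: \Gamma)\ (\mathsf{false}, \gamma) & = & \mathsf{envOf}\ \Gamma\ \gamma
\end{array}
\]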

The full closure conversion involves a lot of machinery, so we 
will only present some highlights here. The type of the translation 
builds on the familiar form from the previous subsections: 
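\[
\begin{array}{rcl}
\mathsf{xTerm} & : & \forall \tau : \mathsf{type}.\ \forall e : \mathsf{term}(\mathsf{nat})\ \tau. \\
 & & \forall \Gamma : \mathsf{list}\ \mathsf{type}.\ \forall \gamma : \mathsf{isfree}\ \Gamma. \\
 & & \mathsf{wf}\ \gamma\ e \to \mathsf{envOf}\ \Gamma\ \gamma \to \mathsf{ccTerm}(V)\ \lfloor \tau \rfloor
\end{array}
\]

To translate a term, we must provide both a superset of its free
variables and a well-formedness proof relative to that set. We can
see why we want to require these proofs by looking at the
translation case for variables. We use the meta-variables $\phi$ to stand for wf
proofs and $\sigma$ to stand for envOf explicit environments.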

\[
\mathsf{xTerm}\ |n|\ \phi\ \sigma = |\sigma(n) \sim \phi|
\]

Here we use the notation $e \sim \phi$ to denote the casting of a CIC
expression $e$ of type $\tau$ to type $\tau'$, by way of presenting an explicit
proof $\phi$ that $\tau = \tau'$. This is exactly the kind of proof provided to
us by the variable case of wf.

To have these equality proofs available at the leaves of a term, 
we need to thread them throughout the translation, in cases like that 
for function application: 

\[
\mathsf{xTerm}\ (e_1\ e_2)\ \phi\ \sigma = (\mathsf{xTerm}\ e_1\ (\pi_1 \phi)\ \sigma)\ (\mathsf{xTerm}\ e_2\ (\pi_2 \phi)\ \sigma)
\]

We used a pair type in the wf definition, applying it Curry-Howard
style as the type of proofs of a conjunction of two other wf
derivations. Now we can use $\pi_1$ and $\pi_2$ to project out the two sub-proofs
and apply them in the recursive xTerm calls.

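Most of the action in closure conversion happens for the
function abstraction case. We present a very sketchy picture of this case,
since there are many auxiliary functions involved that are not
particularly surprising.

\[
\begin{array}{rcl}
\mathsf{xTerm}\ (\lambda e)\ \phi\ \sigma & = & \mathsf{let}\ f = \lambda x_{env}.\ \lambda x_{arg}. \\
 & & \quad \mathsf{let}\ \vec{x} = x_{env} \\
 & & \quad \mathsf{in}\ \mathsf{xTerm}\ (e\ (\mathsf{length}\ \sigma)) \\
 & & \qquad (\mathsf{wfFvs}\ \phi)\ (x_{arg}, \vec{x}) \\
 & & \mathsf{in}\ f\ (\mathsf{makeEnv}\ (\mathsf{wfFvs'}\ \phi)\ \sigma)
\end{array}
\]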

The basic idea is that each function is modified to take its
environment of free variables as a new first argument, which is a tuple of
the appropriate size and types. The notation $\mathsf{let}\ \vec{x} = x_{env}$ is for a
recursive function that binds all of the constituent free variables into
genuine variables by pulling them out of the tuple $x_{env}$. With those
variables bound, we can translate the function body $e$. We use a
function wfFvs for translating the proof of $e$’s well-formedness for
an arbitrary $\gamma$ into a proof for $e$’s real set of free variables, using
Theorem 2. The explicit environment we pass to xTerm is formed
by adding $e$’s “real” argument $x_{arg}$ to the front of an environment
built from the $\vec{x}$. Finally, we unpack the closure we have built, using
a function makeEnv to build an explicit environment by choosing
values out of $\sigma$. We need another auxiliary proof-translation
function wfFvs', again based on Theorem 2.

Our actual closure conversion case study works on the output
of Section 2.2’s CPS translation, which adds additional
complications. We also combine the translations commonly called “closure
conversion” and “hoisting” into a single translation, moving
all function definitions to the top level at the same time that we
change them to take their environments as arguments. If we
implemented the phases separately, it would be harder to enforce that
functions are really closed, since they would have some “off-limits”
variables in their PHOAS scopes. With our implementation, the
type of the closure conversion function guarantees not only that
well-typed syntax is mapped to well-typed syntax, but also that the
output terms contain only closed functions.

We export the final closure conversion as a function over
universally-typed packages, so the messy use of de Bruijn levels
is hidden completely from the outside. The closure conversion
implementation is essentially working with a first-order
representation, and more code is needed than if we had fixed a first-order
representation from the start, since we need to convert our
higher-order terms into their first-order equivalents. Thus, PHOAS would
not make sense for an implementation that just did closure
conversion. The benefit comes when one part of an implementation needs
first-order terms, while other parts are able to take advantage of
higher-order terms. PHOAS allows us to compose the two kinds of
phases seamlessly, where phases need not telegraph through their
types which representations they choose.

3. Proving Semantic Preservation 

Writing the translations of the last section with dependently-typed
abstract syntax has given us all of the benefits of type-preserving
compilation, without the need to rely on ad-hoc testing to
discover if our translations may sometimes propagate type
annotations incorrectly. We would like to go even further and provide
the classic deliverable of compiler verification, which is proof of
semantic preservation. For some suitable notion of program
meaning for each language we manipulate, we want to know that the
output of a translation has the same meaning as the input.
Following our past approach (Chlipala 2007), we choose a denotational
style of meaning assignment that has been called type-theoretic
semantics (Harper and Stone 2000). That is, we provide definitional
compilers from all of the languages we formalize into CIC, and we
construct machine-checked proofs using Coq’s very good built-in
support for reasoning about the terms of CIC, in contrast to
working with an explicit operational or denotational semantics for it.
Past uses of type-theoretic semantics have tended to use
custom-tailored type theories, while we use a small, “universal” type theory
for all object languages; hence, we call our approach foundational
type-theoretic semantics.

We want to automate our proofs as far as possible, to minimize 
the overhead of adding new features to a language and its certified 
implementation. Towards this end, we have implemented a number 
of new tactics using Coq’s tactical language (Delahaye 2000). This 
is a dynamically-typed language whose most important feature is a 
very general construct for pattern matching on CIC terms and proof 
sequents, with a novel backtracking semantics for pattern match 
failure. 


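\[
\begin{array}{rcl}
[\![\cdot]\!] & : & \mathsf{type} \to \star \\
[\![\mathsf{bool}]\!] & = & \mathsf{bool} \\
[\![\tau_1 \to \tau_2]\!] & = & [\![\tau_1]\!] \to [\![\tau_2]\!] \\[1ex]
[\![\cdot]\!] & : & \forall \tau : \mathsf{type}.\ \mathsf{term}([\![\cdot]\!])\ \tau \to [\![\tau]\!] \\
[\![|x|]\!] & = & x \\
[\![\mathsf{true}]\!] & = & \mathsf{true} \\
[\![\mathsf{false}]\!] & = & \mathsf{false} \\
[\![e_1\ e_2]\!] & = & [\![e_1]\!]\ [\![e_2]\!] \\
[\![\lambda e]\!] & = & \lambda x.\ [\![e(x)]\!] \\[1ex]
[\![\cdot]\!] & : & \forall \tau : \mathsf{type}.\ \mathsf{Term}\ \tau \to [\![\tau]\!] \\
[\![E]\!] & = & [\![E\ [\![\cdot]\!]]\!]
\end{array}
\]

Figure 15. Denotation functions for STLC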


The Lambda Tamer library contains about 100 lines of tactic
code that we rely on in the proofs that we will sketch in this section.
Most proofs are performed by looping through a number of different
simplification procedures until no further progress can be made,
at which point either the proof is finished or we can report the set of
unproved subgoals to the user. The core simplification procedures
we use are simplification of propositional structure, application of
CIC computational reduction rules, Prolog-style higher-order logic
programming, and rewriting with quantified equalities. We add a
few tactics that simplify goals that use dependent types in tricky
ways.

We also add a quantifier instantiation framework. It can be hard
to prove goals that begin with $\exists$ quantifiers or use hypotheses that
begin with $\forall$ quantifiers, because we need to pick instantiations for
the quantified variables before proceeding. Thus, it is helpful to
provide a quantifier instantiation tactic parameterized on a function
that chooses an instantiating term given a CIC type. A default
strategy of picking any properly-typed term that occurs in the goal
works surprisingly well because of the rich dependent types that we
use.

3.1 CPS Translation for Simply-Typed Lambda Calculus 

The correctness proof for the CPS translation of Section 2.2 is the 
simplest and involves almost no code that is not a small constant 
factor away from the complexity of a standard pencil-and-paper 
solution. We argue that the proof is actually simpler than it would 
be on paper, because automation takes care of almost all of the 
details. The human proof architect really only needs to suggest 
lemmas and the right induction principles to use in proving them. 

Before we can prove the correctness of the translation, we need 
to give dynamic semantics to our source and target languages. We 
can write a very simple denotation function for the source language, 
this time overloading the notation $[\![\cdot]\!]$ for the denotation functions for
types and terms, as shown in Figure 15. 

There is an interesting development hidden within the
superficially trivial definition of the term denotation function. We choose
the variable type family $V$ to be the type denotation function. That
is, we work with syntax trees where variables are actually the
denotations of terms. This is a perfectly legal choice of variable type,
as shown in the final line above, which defines the denotation of a
universally packaged term. As a result, the translation of variables
is trivial, and the translation of function abstractions can use a CIC
binder which passes the bound variable directly to the syntactic
abstraction body $e$.

Figure 16 gives the Coq code corresponding to Figure 15. 



Fixpoint typeDenote (t : type) : Set :=
  match t with
    | TBool => bool
    | TArrow t1 t2 => typeDenote t1 -> typeDenote t2
  end.

Fixpoint termDenote t (e : term typeDenote t)
  {struct e} : typeDenote t :=
  match e in (term _ t) return (typeDenote t) with
    | EVar _ v => v
    | ETrue => true
    | EFalse => false
    | EApp _ _ e1 e2 =>
      (termDenote e1) (termDenote e2)
    | EAbs _ _ e' => fun x => termDenote (e' x)
  end.

Definition Term t := forall var, term var t.

Definition TermDenote t (E : Term t) :=
  termDenote (E _).


Figure 16. Coq code for STLC denotation functions 
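
To see denotation functions of this kind run end to end, the following self-contained sketch supplies an assumed definition of the typed PHOAS term family, declares implicit arguments explicitly (so the patterns differ slightly from Figure 16), and evaluates a small packaged term. The definitions idBool and appId are invented examples, and the implicit-argument setup is our choice rather than the paper's.

```coq
(* Hypothetical, self-contained variant of the Figure 16 definitions. *)
Inductive type : Type :=
| TBool : type
| TArrow : type -> type -> type.

Section Term.
  Variable var : type -> Type.

  Inductive term : type -> Type :=
  | EVar : forall t, var t -> term t
  | ETrue : term TBool
  | EFalse : term TBool
  | EApp : forall t1 t2, term (TArrow t1 t2) -> term t1 -> term t2
  | EAbs : forall t1 t2, (var t1 -> term t2) -> term (TArrow t1 t2).
End Term.

Arguments EVar {var t} _.
Arguments ETrue {var}.
Arguments EFalse {var}.
Arguments EApp {var t1 t2} _ _.
Arguments EAbs {var t1 t2} _.

Fixpoint typeDenote (t : type) : Type :=
  match t with
  | TBool => bool
  | TArrow t1 t2 => typeDenote t1 -> typeDenote t2
  end.

(* Variables are instantiated with denotations, so EVar is the identity. *)
Fixpoint termDenote {t} (e : term typeDenote t) : typeDenote t :=
  match e in term _ t return typeDenote t with
  | EVar v => v
  | ETrue => true
  | EFalse => false
  | EApp e1 e2 => (termDenote e1) (termDenote e2)
  | EAbs e' => fun x => termDenote (e' x)
  end.

Definition Term (t : type) := forall var, term var t.

Definition TermDenote {t} (E : Term t) : typeDenote t :=
  termDenote (E typeDenote).

(* Invented examples: the boolean identity, and its application to true. *)
Definition idBool : Term (TArrow TBool TBool) :=
  fun var => EAbs (fun x => EVar x).

Definition appId : Term TBool :=
  fun var => EApp (idBool var) ETrue.

Example appId_denote : TermDenote appId = true.
Proof. reflexivity. Qed.
```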


For space reasons, we omit the details of the semantics for the
CPS language. It is in a slightly different form, where the meaning
of term $e$ of type $\tau$ is written $[\![e]\!]k$ for some continuation $k$ of type
$[\![\tau]\!] \to \mathsf{bool}$. The final result of evaluating $e$ is thrown to $k$, which
returns a boolean, the simplest type that we can use to express the
results of a variety of possibly-failing tests.

To state the semantic correctness theorem, we define a standard 
semantic logical relation, by recursion on the structure of syntactic 
types: 

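\[
\begin{array}{rcl}
\simeq & : & \forall \tau : \mathsf{type}.\ [\![\tau]\!] \to [\![\lfloor \tau \rfloor]\!] \to \star \\
f_1 \simeq_{\tau_1 \to \tau_2} f_2 & = & \forall x_1.\ \forall x_2.\ x_1 \simeq_{\tau_1} x_2 \to \forall k.\ \exists r. \\
 & & f_2\ (x_2, k) = k\ r \land f_1\ x_1 \simeq_{\tau_2} r
\end{array}
\]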

Now we can state semantic correctness as: 

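THEOREM 3 (Semantic correctness). For every $\tau : \mathsf{type}$ and $E :
\mathsf{Term}\ \tau$, for any continuation $k : [\![\lfloor \tau \rfloor]\!] \to \mathsf{bool}$, there exists
$r : [\![\lfloor \tau \rfloor]\!]$ such that $[\![\lfloor E \rfloor]\!]k = k\ r$ and $[\![E]\!] \simeq_\tau r$.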

When we specialize the theorem to object language type bool, we
get that, for any $E : \mathsf{Term}\ \mathsf{bool}$, $[\![\lfloor E \rfloor]\!](\lambda b.\ b) = [\![E]\!]$. To convince
ourselves that this is the result we wanted, we only need to
consider adequacy of the definitions of the syntax and semantics of
the source and target languages, assuming that we believe that any
compiler errors can be detected by boolean tests. This
simplification will be even more welcome in the proofs of more complicated
translations, where we do not want to include details of the logical
relations we choose in the trusted code base.

Thinking informally, we can prove Theorem 3 by induction
on the structure of $E$, relying on one other inductively-proved
lemma about the correctness of letTerm. Unfortunately, Coq has
no concept of an induction principle for a function type, and that
is how we are representing terms. Twelf’s meta logic is all about
supporting induction over types involving function spaces. Can
we import some of that convenience to Coq? In the rest of this
subsection, we focus on how to do that by introducing a kind
of explicit well-formedness relation on terms. We can assert as
an axiom that every term is well-formed, and indeed we believe
that this is a consistent axiom, thanks to parametricity of CIC.
Nonetheless, we have no proof of this worked out. It may be
possible to adapt the proof for the Theory of Contexts (Bucalo et al.
2006) or apply the techniques suggested by Hofmann (1999) for
reasoning about HOAS. Regardless of the form that our confidence
in the axiom takes, it might be worthwhile to consider an extension
to Coq that makes this axiom provable within the logic, in much the
same way that support for inductive types was added to an earlier
version of Coq, but we leave that for future work.

Figure 17. Term equivalence inference rules

At the same time, asserting the axiom is really only a
convenience in our development. We could restate our theorems to
require that the “axiom” holds when applied to any terms that 
are mentioned, and we could prove that our translations preserve 
well-formedness. This clutters PHOAS developments, but it is not 
hard to imagine some new support from Coq for automating these 
changes, letting the proof developer work with notation like what 
appears in our current development. We would be doing extra work 
to prove lemmas that seem likely to be instances of more general 
meta-theorems, but we would at least avoid explicit dependence on 
axioms. Any ground instance of the axiom is provable easily by a 
simple logic program. 

In any case, it is convenient to have a well-formedness judgment 
defined for our PHOAS terms. Rather than define a traditional well- 
formedness judgment directly, we instead formalize what it means 
for two terms with different concrete choices for V to be equivalent. 
We say that a universally-packaged term is well-formed if and only 
if any choice of two V instantiations leads to a pair of equivalent 
terms. We denote the judgment as $\Gamma \vdash e_1 \equiv e_2$, where $\Gamma$ is a set
of pairs of variables from the $V$ types of $e_1$ and $e_2$. In Figure 17,
we give the equivalence judgment for STLC in the usual informal
natural deduction style, omitting some complications arising from
typing. Now we assert as an axiom that, for any $E : \mathsf{Term}\ \tau$ and
any types $V_1$ and $V_2$, $\emptyset \vdash E\ V_1 \equiv E\ V_2$.
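
One plausible way to render such an equivalence judgment in Coq, shown here for untyped terms only and therefore ignoring the typing complications mentioned above, is an inductive predicate over a list of variable pairs. All names (equiv, EqVar, EqApp, EqAbs, Wf, ident) are assumptions of this sketch. Instead of asserting well-formedness as an axiom, the example proves it for one ground term, in line with the remark that ground instances are easy to establish.

```coq
(* Hypothetical sketch of a term equivalence judgment for untyped PHOAS. *)
Require Import List.

Section Term.
  Variable var : Type.

  Inductive term : Type :=
  | Var : var -> term
  | App : term -> term -> term
  | Abs : (var -> term) -> term.
End Term.

Arguments Var {var} _.
Arguments App {var} _ _.
Arguments Abs {var} _.

Definition Term : Type := forall var : Type, term var.

Section Equiv.
  Variables var1 var2 : Type.

  (* equiv G e1 e2: e1 and e2 have the same shape, and their free
     variables are related pairwise by the context G. *)
  Inductive equiv : list (var1 * var2) -> term var1 -> term var2 -> Prop :=
  | EqVar : forall (G : list (var1 * var2)) (x1 : var1) (x2 : var2),
      In (x1, x2) G -> equiv G (Var x1) (Var x2)
  | EqApp : forall (G : list (var1 * var2))
                   (f1 a1 : term var1) (f2 a2 : term var2),
      equiv G f1 f2 -> equiv G a1 a2 -> equiv G (App f1 a1) (App f2 a2)
  | EqAbs : forall (G : list (var1 * var2))
                   (b1 : var1 -> term var1) (b2 : var2 -> term var2),
      (forall (x1 : var1) (x2 : var2),
          equiv ((x1, x2) :: G) (b1 x1) (b2 x2)) ->
      equiv G (Abs b1) (Abs b2).
End Equiv.

(* A packaged term is well-formed if any two instantiations are equivalent. *)
Definition Wf (E : Term) : Prop :=
  forall var1 var2 : Type, equiv var1 var2 nil (E var1) (E var2).

Definition ident : Term := fun var => Abs (fun x => Var x).

Example ident_wf : Wf ident.
Proof.
  unfold Wf, ident. intros var1 var2.
  apply EqAbs. intros x1 x2. simpl.
  apply EqVar. simpl. left. reflexivity.
Qed.
```

Induction on derivations of equiv gives simultaneous access to both instantiations of a term, which is the parallel induction that the function-typed representation does not provide directly.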

The judgment $\equiv$ suggests a useful proof strategy for Theorem
3. By inducting over $\equiv$ derivations, we can in effect perform
parallel induction over a term $E$, where at each stage we have several
versions of $E$ available, corresponding to different choices of $V$
but known to share the same structure. To prove Theorem 3, it is
useful to do parallel induction where we choose $V$ to be both the
source-level type denotation function and the target-level
denotation function composed with the type translation. In particular, the
main action happens in proving this lemma:

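LEMMA 1. For every $\tau : \mathsf{type}$, $e_1 : \mathsf{term}([\![\cdot]\!])\ \tau$, $e_2 : \mathsf{term}([\![\cdot]\!] \circ
\lfloor \cdot \rfloor)\ \tau$, and set of variable pairs $\Gamma$:

• If $\Gamma \vdash e_1 \equiv e_2$

• And for every $(x_1, x_2) \in \Gamma$ associated with source type $\tau'$, it
follows that $x_1 \simeq_{\tau'} x_2$,

• Then for any continuation $k : [\![\lfloor \tau \rfloor]\!] \to \mathsf{bool}$, there exists
$r : [\![\lfloor \tau \rfloor]\!]$ such that $[\![\lfloor e_2 \rfloor]\!]k = k\ r$ and $[\![e_1]\!] \simeq_\tau r$.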

The proof script for this lemma involves stating a few proof hints,
asking to induct on the $\equiv$ derivation, and calling the generic
simplification and quantifier instantiation tactic. We derive Theorem 3 as
an easy corollary, producing the initial $\equiv$ proof by using the axiom
we asserted.

It is interesting to stop at this point and consider how 
“higher-order” this proof is. What have we gained over proofs with, for 
instance, nominal or de Bruijn representations? The main induction 
principle comes from the rules for the equivalence judgment, 
which includes a first-order list of variable pairs. Parts of the proof 
involve sub-proofs of membership in this list, which we can think 
of as isomorphic to natural numbers. Nonetheless, in crucial 
contrast to nominal proofs, our proofs require no treatment of 
reasoning about variable renamings or permutations; and, in contrast to 
de Bruijn proofs, the contexts of our equivalence judgment are freely 
reorderable, with no need to mirror reorderings in terms by 
tweaking variable indices. We could also go even further and 
parameterize the well-formedness judgment by a predicate on variable 
pairs, removing the explicit list and just requiring that free variable 
pairs satisfy the predicate. This solution ends up working a lot like 
the Twelf facility for defining regular worlds and seems to be “just as 
higher-order,” though as the subject of an axiom it is perhaps harder 
to believe. 
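
One way to picture the predicate-parameterized variant is the Coq sketch below. It is only an illustration under assumptions: the object language is a cut-down STLC with booleans and functions, the names type, term, equivP, and ident_equivP are hypothetical, and the choice to make the binder case assume the predicate for each fresh variable pair is one plausible reading of the idea, not a definition taken from the paper.

```coq
(* Assumed minimal object-language types: booleans and functions. *)
Inductive type : Type :=
| Bool : type
| Arrow : type -> type -> type.

Section terms.
  Variable var : type -> Type.

  Inductive term : type -> Type :=
  | Var : forall t, var t -> term t
  | Tru : term Bool
  | App : forall t1 t2, term (Arrow t1 t2) -> term t1 -> term t2
  | Abs : forall t1 t2, (var t1 -> term t2) -> term (Arrow t1 t2).
End terms.

Section equivP.
  Variables var1 var2 : type -> Type.

  (* P replaces the explicit list: it says which variable pairs are related. *)
  Variable P : forall t : type, var1 t -> var2 t -> Prop.

  Inductive equivP : forall t, term var1 t -> term var2 t -> Prop :=
  | PVar : forall t (x1 : var1 t) (x2 : var2 t),
      P t x1 x2 ->
      equivP t (Var var1 t x1) (Var var2 t x2)
  | PTru : equivP Bool (Tru var1) (Tru var2)
  | PApp : forall t1 t2
                  (f1 : term var1 (Arrow t1 t2)) (f2 : term var2 (Arrow t1 t2))
                  (a1 : term var1 t1) (a2 : term var2 t1),
      equivP (Arrow t1 t2) f1 f2 ->
      equivP t1 a1 a2 ->
      equivP t2 (App var1 t1 t2 f1 a1) (App var2 t1 t2 f2 a2)
  | PAbs : forall t1 t2
                  (b1 : var1 t1 -> term var1 t2) (b2 : var2 t1 -> term var2 t2),
      (* Bodies are compared only at variable pairs that satisfy P. *)
      (forall (x1 : var1 t1) (x2 : var2 t1),
          P t1 x1 x2 -> equivP t2 (b1 x1) (b2 x2)) ->
      equivP (Arrow t1 t2) (Abs var1 t1 t2 b1) (Abs var2 t1 t2 b2).
End equivP.

(* With the trivially true predicate, the identity function is well-formed. *)
Example ident_equivP : forall var1 var2 : type -> Type,
    equivP var1 var2 (fun _ _ _ => True) (Arrow Bool Bool)
           (Abs var1 Bool Bool (fun x => Var var1 Bool x))
           (Abs var2 Bool Bool (fun x => Var var2 Bool x)).
Proof.
  intros var1 var2. apply PAbs. intros x1 x2 H. simpl.
  apply PVar. exact H.
Qed.
```

In this formulation no context is threaded through the rules, so there is no list membership to reason about; the cost is that the meaning of the judgment now depends on the chosen predicate.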

3.2 CPS Translation for System F 

We can extend this correctness proof to Section 2.3’s CPS 
translation for System F. The main change is that we choose to do 
parallel induction with three different choices of the type variable 
type T; we consider versions of source types where variables are 
source-level type denotations, where variables are target-level type 
denotations, and where variables are relations between source- and 
target-level denotation types. We do this within an analogous parallel 
induction over terms. 

The idea is that, as we recurse through term structure, we stash 
the appropriate specialization of our logical relation for each free 
type variable in that variable in the third parallel version of the 

The logical relation from the last subsection is revised to deal 
properly with type variables and universal types. Instead of taking 
a single type as its main argument, it instead takes all three parallel 
versions of the source type. Figure 18 presents the relation. 

We have omitted some explicit casts and other details needed 
to get the dependent typing to work out. The key new lemma 
that we must prove about this logical relation, compared to the 
last subsection, is that for any $\forall$ type body t and relation R 
over arbitrary types, $\simeq$ gives us the same relation when called 
with t(R) as when called with the result of substituting R for t’s free 
variable using the substitution relation $\cdot[\cdot] \mapsto \cdot$ 
from Section 2.3. Our proof of that lemma is hundreds of lines, since 
we do not yet have effective automation support for the uses of 
dependent typing that arise. With that lemma available, we prove the 
main theorem in a few dozen lines. 

3.3 Pattern Match Compilation 

The correctness proof for the pattern match compiler from Section 
2.4 is almost trivial once the translation is defined. We do not need 
a logical relation, because the source and target type systems are 
identical; the final theorem is stated with simple equality. We need 
to state a few lemmas and give the right induction principles to use 
to prove them, but the proof is almost entirely automatic. 

3.4 Closure Conversion for Simply-Typed Lambda Calculus 

As for pattern match compilation, we state the closure conversion 
correctness theorem for Section 2.5’s translation using equality. 
Since we have significantly more auxiliary functions than in the 
other examples, we need to state more lemmas about them, but the 
overall proof is again largely automated, relying on a few dozen 
carefully-chosen proof hints. We can also prove that every term 
is well-formed in the sense of Section 2.5’s wf judgment, as a 
consequence of the standard higher-order well-formedness axiom 
that we assume. 

Figure 19. Lines-of-code counts for the object languages in the 
main case studies 

Figure 20. Lines-of-code counts for the translations in the main 
case studies 

Feature                 unit     ×     +     N   Lists 
Source syntax              5    18    20    16      16 
Source semantics           2     4     8     8       4 
Equivalence judgment       2    10    11     9       7 
CPS syntax                 5    18    20    16      16 
CPS semantics              2     4     8     7      16 
Translation                5    17    17    15      39 
Logical relation           1     3     6     1      11 
Lemmas                     0     0     0     0      26 
Proof hints                1     1     1     1      15 

Figure 21. Lines-of-code counts for features added to the STLC 
CPS case study 

4. Measuring the Overhead of Formal Proof 

Implementations and correctness proofs for the translations of 
Sections 2.2 through 2.5 are included in our source distribution. In 
this section, we summarize the amounts of code needed for the different 
pieces of these components and for some additional experiments. 

Figure 19 shows the number of lines of code used to formalize 
each language from the case studies. For each language, we show 
how many lines are needed to define its combined syntax and type 
system, along with how many lines are needed to give its dynamic 
semantics and how many lines are needed to define its equivalence 
judgment, if we needed one for that case study. For each translation, 
Figure 20 gives the size of the translation proper along with the size 
of its correctness proof. The latter counts include both code to state 
theorems and code to prove them. 

We also ran some experiments with extending our STLC CPS 
conversion with a number of standard types from functional 
programming, measuring how much we had to change our implementation 
to add each feature. Figure 21 gives the results. We added 
unit, with its single term constructor; product types, with a pair 
formation constructor and two projection operators; sum types, 
with inl and inr injections and case analysis; natural numbers, with 
“zero” and “successor” constructors and one-level case analysis; 
and lists, with “nil” and “cons” operators and built-in “fold left” 
functions. 

Figure 18. Logical relation for System F CPS translation 

For each feature, we list how many lines were needed for its 
syntax/type system and dynamic semantics in the source and target 
languages, its inference rules for the source-level equivalence 
judgment, the CPS translation of its types and terms, its case in the 
logical relation used by the soundness proof, any extra lemmas used in 
that proof, and the proof hints added to be used by the automation 
machinery. Only adding list types required proving a new lemma, 
which has to do with folding over two lists in parallel, maintaining 
a binary relation between accumulators. The proof hints added are 
along the lines of “replace any variable of type unit with ()” and 
“when you see a pattern match on a value of sum type in the goal, 
try a case analysis on that value.” 
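
The Coq sketch below illustrates the two kinds of automation support mentioned above. It is not the lemma or the hints from the source distribution: the names fold_left_rel, unit_elim, and sum_case are hypothetical, the lemma is stated over the standard library’s fold_left and Forall2 as a stand-in for the object language’s fold, and the hints are written as standalone tactics so that their behavior is easy to see.

```coq
Require Import List.

(* Parallel fold: if the step functions preserve R on accumulators whenever
   the list elements are related by Q, then folding over two Q-related lists
   from R-related accumulators gives R-related results. *)
Lemma fold_left_rel :
  forall (A1 A2 B1 B2 : Type)
         (R : A1 -> A2 -> Prop) (Q : B1 -> B2 -> Prop)
         (f1 : A1 -> B1 -> A1) (f2 : A2 -> B2 -> A2),
    (forall a1 a2 b1 b2, R a1 a2 -> Q b1 b2 -> R (f1 a1 b1) (f2 a2 b2)) ->
    forall l1 l2, Forall2 Q l1 l2 ->
    forall a1 a2, R a1 a2 -> R (fold_left f1 l1 a1) (fold_left f2 l2 a2).
Proof.
  intros A1 A2 B1 B2 R Q f1 f2 Hstep l1 l2 Hall.
  induction Hall as [| b1 b2 l1' l2' Hb Hl IH]; simpl; intros a1 a2 Ha.
  - exact Ha.
  - apply IH. apply Hstep; assumption.
Qed.

(* "Replace any variable of type unit with ()": destruct turns x into tt. *)
Ltac unit_elim :=
  repeat match goal with
         | [ x : unit |- _ ] => destruct x
         end.

(* "When the goal matches on a value of sum type, case-split on that value." *)
Ltac sum_case :=
  match goal with
  | [ |- context[match ?E with inl _ => _ | inr _ => _ end] ] => destruct E
  end.

Example unit_demo : forall (f : unit -> nat) (x : unit), f x = f tt.
Proof. intros; unit_elim; reflexivity. Qed.

Example sum_demo : forall s : nat + nat,
    match s with inl _ => true | inr _ => true end = true.
Proof. intros; sum_case; reflexivity. Qed.
```

Both tactics only expose structure that simplification can then use, which is consistent with the small per-feature hint counts in Figure 21.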

5. Related Work 

PHOAS is weak HOAS (Despeyroux et al. 1995; Honsell et al. 
2001) where we replace a global type parameter with a parameter 
bound locally and instantiated with different values throughout a 
development. In both settings, we rely on axioms to prove 
semantic correctness theorems, though we need no axioms for our type 
preservation theorems with PHOAS. The ability to choose different 
concrete variable types for different contexts gives PHOAS some 
additional power in both functional programming and proving. On 
the other hand, the axioms we assume for PHOAS are more 
complicated and language-specific than those weak HOAS assumes for 
the Theory of Contexts, though we could avoid the language 
specificity by encoding all syntax in a single parameterized universal 
syntax type. 

Trifonov et al. (2000) used parametricity to facilitate an 
inductive definition of HOAS-style syntax for a language supporting 
intensional type analysis. They used kind polymorphism to rule out 
exotic terms in the encoding of type variable binders. 

Guillemette and Monnier (2008) used GHC Haskell to implement 
a compiler with a proof of type preservation but not semantic 
preservation. The implementations of their transformations are very 
similar to ours. In one notable exception, they resort to first-order 
representation of type variables, to make theorems about 
substitution easier to prove. With Coq’s support for automating 
higher-order proofs, we are able to stick to higher-order variable 
representation for both type and term variables. 

There have been many studies of the classic first-order variable 
binding representations within proof assistants, including studies 
using nominal syntax with two classes of variables in LEGO (McKinna 
and Pollack 1999), de Bruijn indices in LEGO (Altenkirch 
1993), nominal syntax in Isabelle/HOL (Urban and Tasson 2005), 
and locally nameless syntax in Coq (Aydemir et al. 2008). All of 
these first-order approaches involve extra syntactic bookkeeping in 
the definition of functions over syntax and the statement and proofs 
of theorems about syntax. While this overhead should be compared 
to our use of term equivalence relations in PHOAS, we only pay 
that cost in our semantic correctness proofs, and our meta language 
parametricity lets us manifest proofs of well-formedness judgments 
where needed, saving us from the standard first-order technique of 
threading such proofs throughout a development. 

Developments using HOAS have most commonly been done 
in Twelf (Pfenning and Schürmann 1999), which supports logic 
programming, but not functional programming, over syntax. 
Traditional HOAS removes the need to implement syntactic helper 
functions like substitution. We mostly get around the problem in 
PHOAS by sticking to type-theoretic formalizations that involve 
few low-level syntactic operations. Twelf’s meta-logic includes 
many features for reasoning about judgment contexts in 
inductive proofs; with PHOAS, we are reimplementing special cases 
of those features, with examples like the term equivalence 
judgments parameterized on variable contexts. There have been 
several approaches proposed for functional programming over HOAS 
terms (Schürmann et al. 2001; Pientka 2008), but they all involve 
creating new type systems rather than working within a 
general-purpose type theory like CIC, and their implementations are 
still immature and lacking in the kind of “proof assistant ecosystem” 
associated with tools like Coq, Isabelle, and Twelf. 

Several projects have considered “hybrid” approaches, where 
syntax is implemented with de Bruijn indices or another first-order 
technique at the lowest level, but a HOAS interface is built on top, 
including convenient induction principles. This has been 
implemented in Isabelle/HOL (Ambler et al. 2002), Nuprl (Barzilay and 
Allen 2002), Coq (Capretta and Felty 2006), and MetaPRL (Hickey 
et al. 2006). PHOAS has a close qualitative connection to these 
approaches, as it also allows switching between first-order and 
higher-order views of terms, as demonstrated in our closure conversion. 

We already mentioned a few projects in compiler verification for 
first-order languages. The bibliography by Dave (2003) provides 
extensive pointers to other work. 

Minamide and Okuma (2003) verified CPS translations in 
Isabelle/HOL, using a nominal representation, and Dargaye and 
Leroy (2007) used Coq to implement a certified CPS translation 
for a simply-typed lambda calculus with a number of ML-like 
features. The languages of the latter project are more realistic than 
those we have treated in our case studies, but the target language 
has the drawback that it includes two different classes of variables 
to make the translation easier to verify. Both projects pay the usual 
bookkeeping costs of first-order methods. 

Tian (2006) formalized CPS translation correctness in Twelf. As 
Twelf contains no production-quality proof automation, the proofs 
are entirely manual, leading to a much larger development than in 
our corresponding case study. 



6. Conclusion 

We have shown how parametric higher-order abstract syntax 
(PHOAS) can be used to support convenient functional 
programming with the syntax of languages with nested variable binders. 
Proof assistants like Coq can be used to produce very compact 
and highly automated proofs of correctness for program 
transformations implemented with PHOAS. Translations that need to take 
variable identity into account take more effort to write, but the 
encoding allows for relatively direct implementations, and compiler 
phases that need variable identity can be composed with phases 
that do not, without sacrificing ease of development and proof for 
the latter category. 